
Commit f425959

feat: refactor word count logic and add caching service
1 parent c74ebb6 commit f425959

13 files changed

Lines changed: 671 additions & 307 deletions

README.md

Lines changed: 12 additions & 12 deletions
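@@ -13,7 +13,7 @@
 This plugin is built on exactly this idea:
 - Handle complex logic on the backend
 - Let front-end templates handle presentation only
-- Provide a concise Finder API
+- Provide a concise Finder API for themes
 - Reduce unnecessary JavaScript dependencies

 <details><summary>Front-end dynamic loading vs. back-end server-side rendering</summary>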
@@ -35,7 +35,6 @@

 ## TODO

-- [ ] Cache the post word count API results and refresh them only when a post is updated
 - [ ] Provide a random post API
 - [ ] Provide an estimated reading time API, with related configuration options

@@ -58,18 +57,18 @@
 <!--/* Check that the plugin is available before using the API */-->
 <th:block th:if="${pluginFinder.available('extra-api')}">
   <span
-    th:text="|Total words: ${extraApiStatsFinder.wordCount()}|"
+    th:text="|Total words: ${extraApiStatsFinder.postWordCount()}|"
   ></span>
 </th:block>

 <!--/* Putting both on one tag also works; th:if has higher precedence than th:text */-->
 <span
   th:if="${pluginFinder.available('extra-api')}"
-  th:text="|Total words: ${extraApiStatsFinder.wordCount()}|"
+  th:text="|Total words: ${extraApiStatsFinder.postWordCount()}|"
 ></span>

 <!--/* Natural template style */-->
-<span th:if="${pluginFinder.available('extra-api')}">Total words: [[${extraApiStatsFinder.wordCount()}]]</span>
+<span th:if="${pluginFinder.available('extra-api')}">Total words: [[${extraApiStatsFinder.postWordCount()}]]</span>
 ```

 **Notes**
@@ -83,14 +82,14 @@
 #### Post word count

 ```javascript
-extraApiStatsFinder.wordCount({
+extraApiStatsFinder.postWordCount({
   name: 'post-metadata-name', // Optional; if omitted, returns the total word count of all posts
   version: 'release' | 'draft' // Optional; defaults to 'release'
 });
 ```

 ```javascript
-extraApiStatsFinder.wordCount();
+extraApiStatsFinder.postWordCount();
 ```

 **Description**
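@@ -106,6 +105,7 @@ extraApiStatsFinder.wordCount();
 - Returns 0 when the input is empty or the post does not exist; no exception is thrown.
 - Performance notes:
   - Each call is cheap, so the API is suitable for direct use in templates.
+  - Counts are computed and cached automatically at startup and recomputed only when a post's content is updated.

 **Parameters**
 - `name:string` – the post's `metadata.name` (optional; if omitted, the whole site is counted)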
@@ -117,18 +117,18 @@ extraApiStatsFinder.wordCount();
 **Usage examples**
 ```html
 <!--/* Count the words in the post's published version; for use in /templates/post.html */-->
-<span th:text="${extraApiStatsFinder.wordCount({name: post.metadata.name})}"></span>
+<span th:text="${extraApiStatsFinder.postWordCount({name: post.metadata.name})}"></span>

 <!--/* Count the words in the post's latest version (including drafts); for use in /templates/post.html */-->
-<span th:text="${extraApiStatsFinder.wordCount({name: post.metadata.name, version: 'draft'})}"></span>
+<span th:text="${extraApiStatsFinder.postWordCount({name: post.metadata.name, version: 'draft'})}"></span>

 <!--/* Count the total words of all published posts on the site; usable in any template */-->
-<span th:text="${extraApiStatsFinder.wordCount()}"></span>
+<span th:text="${extraApiStatsFinder.postWordCount()}"></span>
 <!--/* Equivalent to the form below */-->
-<span th:text="${extraApiStatsFinder.wordCount({})}"></span>
+<span th:text="${extraApiStatsFinder.postWordCount({})}"></span>

 <!--/* Count the total words of the latest version of every post on the site (including drafts); usable in any template */-->
-<span th:text="${extraApiStatsFinder.wordCount({version: 'draft'})}"></span>
+<span th:text="${extraApiStatsFinder.postWordCount({version: 'draft'})}"></span>
 ```

 ## Development environment

build.gradle

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 plugins {
     id 'java'
-    id "io.freefair.lombok" version "8.13"
+    id "io.freefair.lombok" version "8.14"
     id "run.halo.plugin.devtools" version "0.6.1"
 }

src/main/java/top/howiehz/halo/plugin/extra/api/HaloPluginExtraApiPlugin.java

Lines changed: 20 additions & 6 deletions
@@ -3,27 +3,41 @@
 import org.springframework.stereotype.Component;
 import run.halo.app.plugin.BasePlugin;
 import run.halo.app.plugin.PluginContext;
+import top.howiehz.halo.plugin.extra.api.service.PostWordCountService;

 /**
- * <p>Plugin main class to manage the lifecycle of the plugin.</p>
- * <p>This class must be public and have a public constructor.</p>
+ * Plugin main class to manage the lifecycle of the plugin.
+ * 插件主类,负责管理插件的生命周期。
  * <p>Only one main class extending {@link BasePlugin} is allowed per plugin.</p>
- *
- * @author HowieHz
- * @since 1.0.0
+ * <p>每个插件只能有一个继承 {@link BasePlugin} 的主类。</p>
  */
 @Component
 public class HaloPluginExtraApiPlugin extends BasePlugin {

-    public HaloPluginExtraApiPlugin(PluginContext pluginContext) {
+    private final PostWordCountService postWordCountService;
+
+    public HaloPluginExtraApiPlugin(PluginContext pluginContext, PostWordCountService postWordCountService) {
         super(pluginContext);
+        this.postWordCountService = postWordCountService;
     }

+    /**
+     * Called when the plugin is starting.
+     * 插件启动时调用。
+     */
     @Override
     public void start() {
         System.out.println("插件启动成功!");
+        // Preload all caches when the plugin starts
+        // 插件启动时预加载所有缓存
+        // post word count cache / 文章字数缓存
+        postWordCountService.warmUpAllCache();
     }

+    /**
+     * Called when the plugin is stopping.
+     * 插件停止时调用。
+     */
     @Override
     public void stop() {
         System.out.println("插件停止!");
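The source of `PostWordCountService` is not among the files shown here, so the following is only a guess at the general shape of a "warm up at start, refresh on update" cache. Apart from `warmUpAllCache`, which `start()` calls above, every name is hypothetical, and the sketch is synchronous where the real service presumably is reactive.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical, synchronous sketch of a warm-up-then-refresh word count cache. */
class WordCountCacheSketch {
    // Post name -> cached word count.
    private final Map<String, BigInteger> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive "load content and count words" step.
    private final Function<String, BigInteger> counter;

    WordCountCacheSketch(Function<String, BigInteger> counter) {
        this.counter = counter;
    }

    /** Computes every post's count once, for example when the plugin starts. */
    void warmUpAllCache(Iterable<String> postNames) {
        for (String name : postNames) {
            cache.put(name, counter.apply(name));
        }
    }

    /** Recomputes a single entry; meant to be called when that post's content changes. */
    void refresh(String postName) {
        cache.put(postName, counter.apply(postName));
    }

    /** Returns the cached count for one post, or zero if the post is unknown. */
    BigInteger get(String postName) {
        return cache.getOrDefault(postName, BigInteger.ZERO);
    }

    /** Returns the sum of all cached counts without recounting anything. */
    BigInteger total() {
        return cache.values().stream().reduce(BigInteger.ZERO, BigInteger::add);
    }
}
```

With this shape, template calls only read from the map; the counting function runs once per post at warm-up and once more for each later update.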

src/main/java/top/howiehz/halo/plugin/extra/api/finder/ExtraApiStatsFinder.java

Lines changed: 16 additions & 9 deletions
@@ -2,30 +2,37 @@

 import reactor.core.publisher.Mono;
 import java.util.Collections;
+import java.math.BigInteger;

 /**
  * Finder for calculating post word/character counts for themes to use.
+ * 供主题使用的文章字数/字符数统计 Finder。
  */
 public interface ExtraApiStatsFinder {
     /**
      * Unified word count API.
+     * 统一的字数统计接口。
+     * <p>
      * Parameters (all optional):
-     * - name: metadata.name of the post
-     * - version: 'release' or 'draft' (default 'release')
+     * 参数(均为可选):
+     * - name: metadata.name of the post / 文章的 metadata.name
+     * - version: 'release' or 'draft' (default 'release') / 版本:发布版或草稿(默认发布版)
      * If name is provided, count the specific post; otherwise, count all posts.
+     * 若提供 name 参数,则统计对应文章;否则统计所有文章的总字数。
+     * </p>
      *
-     * @param params parameter map from templates
-     * @return word count as Mono (non-negative)
+     * @param params parameter map from templates / 来自模板的参数映射
+     * @return word count as Mono (non-negative) / 返回字数(非负)的 Mono
      */
-    Mono<Integer> wordCount(java.util.Map<String, Object> params);
+    Mono<BigInteger> postWordCount(java.util.Map<String, Object> params);

     /**
      * Get total word count of all published posts.
      * 获取所有已发布文章的总字数。
      *
-     * @return word count as Mono (non-negative)
+     * @return word count as Mono (non-negative) / 返回字数(非负)的 Mono
      */
-    default Mono<Integer> wordCount(){
-        return wordCount(Collections.emptyMap());
-    };
+    default Mono<BigInteger> postWordCount() {
+        return postWordCount(Collections.emptyMap());
+    }
 }
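The diff does not say why the return type changes from `Integer` to `BigInteger`. One plausible reason, stated here purely as an assumption, is that a site-wide total is a sum over many posts: `int` arithmetic wraps silently on overflow, while `BigInteger` has no fixed upper bound. The class name `OverflowSketch` is illustrative only.

```java
import java.math.BigInteger;

class OverflowSketch {
    /** Sums with int arithmetic, which wraps around past Integer.MAX_VALUE. */
    static int sumInt(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    /** Sums with BigInteger, which grows as needed instead of wrapping. */
    static BigInteger sumBig(int[] counts) {
        BigInteger total = BigInteger.ZERO;
        for (int c : counts) {
            total = total.add(BigInteger.valueOf(c));
        }
        return total;
    }

    public static void main(String[] args) {
        int[] counts = {Integer.MAX_VALUE, 1};
        System.out.println(sumInt(counts)); // prints -2147483648
        System.out.println(sumBig(counts)); // prints 2147483648
    }
}
```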
Lines changed: 17 additions & 155 deletions
@@ -1,190 +1,52 @@
 package top.howiehz.halo.plugin.extra.api.finder;

 import java.util.*;
-import java.util.regex.Pattern;
-import org.springframework.data.domain.Sort;
+import java.math.BigInteger;
 import org.springframework.stereotype.Component;
 import reactor.core.publisher.Mono;
-import run.halo.app.content.ContentWrapper;
-import run.halo.app.content.PostContentService;
-import run.halo.app.core.extension.content.Post;
-import run.halo.app.extension.ListOptions;
-import run.halo.app.extension.ReactiveExtensionClient;
 import run.halo.app.theme.finders.Finder;
+import top.howiehz.halo.plugin.extra.api.service.PostWordCountService;

 /**
  * Implementation of ExtraApiStatsFinder.
+ * 统计 Finder 的实现,用于为主题提供字数统计能力。
  */
 @Component
 @Finder("extraApiStatsFinder")
 public class ExtraApiStatsFinderImpl implements ExtraApiStatsFinder {

-    private final ReactiveExtensionClient client; // 响应式扩展客户端 / Reactive extension client
-    private final PostContentService postContentService; // 文章内容服务 / Post content service
+    private final PostWordCountService postWordCountService;

     /**
      * Constructor to initialize ExtraApiStatsFinderImpl with required dependencies.
-     * 构造函数,使用必需的依赖项初始化 ExtraApiStatsFinderImpl
+     * 构造函数,注入所需依赖
      */
-    public ExtraApiStatsFinderImpl(ReactiveExtensionClient client,
-        PostContentService postContentService) {
-        this.client = client; // 注入响应式扩展客户端 / Inject reactive extension client
-        this.postContentService = postContentService; // 注入文章内容服务 / Inject post content service
-    }
35-
/**
36-
* Get the word count of content by post name and method name.
37-
* 根据文章名称和方法名获取内容的字数统计。
38-
*
39-
* @param name the post name / 文章名称
40-
* @param methodName the method name to invoke / 要调用的方法名
41-
* @return word count as Mono / 返回字数统计的 Mono
42-
*/
43-
private Mono<Integer> postContentCountByName(String name, String methodName) {
44-
if (name == null || name.isBlank()) {
45-
return Mono.just(0); // 空名称直接返回0 / Return 0 for empty name
46-
}
47-
48-
// 使用函数接口映射方法名到对应的服务调用 / Use function interface to map method name to service call
49-
Mono<ContentWrapper> contentMono = switch (methodName) {
50-
case "getHeadContent" -> postContentService.getHeadContent(name);
51-
case "getReleaseContent" -> postContentService.getReleaseContent(name);
52-
default -> Mono.empty(); // 不支持的方法名返回空 / Return empty for unsupported method
53-
};
54-
55-
return contentMono.map(ContentWrapper::getContent) // 提取 content 字段 / Extract content field
56-
.map(content -> countWords(
57-
extractText(content))) // 从 HTML 提取文本并计数 / Extract text and count
58-
.onErrorReturn(0) // 出错时返回 0 / Return 0 on error
59-
.defaultIfEmpty(0); // 空结果时返回 0 / Return 0 if empty
60-
}
61-
-    // Patterns for stripping HTML quickly
-    // 快速移除HTML的正则表达式模式 / Patterns for quickly stripping HTML
-    private static final Pattern HTML_CONTENT_REMOVAL =
-        Pattern.compile("(?is)<(?:script|style)\\b[^>]*>.*?</(?:script|style)>|<[^>]+>|&[a-zA-Z0-9#]+;");
-
-    /**
-     * Extract plain text from HTML content by removing tags and entities.
-     * Removes script and style tags, HTML tags, and normalizes whitespace.
-     * 从 HTML 内容中提取纯文本,移除标签和实体。
-     * 移除 script 和 style 标签、HTML 标签,并规范化空白字符。
-     *
-     * @param html the HTML content / HTML 内容
-     * @return plain text / 返回纯文本
-     */
-    static String extractText(String html) {
-        if (html == null || html.isBlank()) {
-            return "";
-        }
-
-        // 一次性处理所有HTML标签和实体 / Process all HTML tags and entities in one pass
-        return HTML_CONTENT_REMOVAL.matcher(html).replaceAll(" ");
-    }
-
-    /**
-     * Count words in text, supporting both CJK characters and ASCII words.
-     * CJK characters are counted individually, ASCII letters/digits are grouped as words.
-     * 统计文本中的词数,支持中日韩字符和 ASCII 单词。
-     * 中日韩字符单独计数,ASCII 字母/数字按单词分组计数。
-     *
-     * @param text the input text / 输入文本
-     * @return word count / 返回词数统计
-     */
-    static int countWords(String text) {
-        if (text == null || text.isEmpty()) {
-            return 0;
-        }
-        int count = 0; // 字数计数器 / Word count counter
-        boolean inAsciiWord = false; // 是否在ASCII单词中 / Whether in an ASCII word
-        int length = text.length();
-        for (int i = 0; i < length; ) {
-            int codePoint = text.codePointAt(i); // 获取当前字符的码点 / Get code point of current character
-            if (isCJK(codePoint)) {
-                // count each CJK code point as one word/char
-                // 每个CJK码点计为一个字/词 / Count each CJK code point as one word/char
-                count++;
-                inAsciiWord = false; // 重置ASCII单词状态 / Reset ASCII word state
-            } else if (Character.isLetterOrDigit(codePoint)) {
-                // group consecutive ASCII letters/digits as one word
-                // 连续的ASCII字母/数字作为一个单词 / Group consecutive ASCII letters/digits as one word
-                if (!inAsciiWord) {
-                    count++; // 开始新的ASCII单词 / Start a new ASCII word
-                    inAsciiWord = true; // 设置在ASCII单词中 / Set in ASCII word
-                }
-            } else {
-                inAsciiWord = false; // 非字母数字字符,重置状态 / Non-alphanumeric character, reset state
-            }
-            // 使用位运算优化字符长度计算 / Optimize character length calculation with bitwise operations
-            i += (codePoint <= 0xFFFF) ? 1 : 2;
-        }
-        return Math.max(count, 0); // 确保返回非负数 / Ensure non-negative result
-    }
-
-
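The counting rule removed from the finder above (each CJK code point counts as one, each run of other letters or digits counts as one word, after tags and entities are replaced by spaces) can be tried in isolation with the sketch below. It is not the author's code: the class name is made up, and `Character.UnicodeScript` is swapped in for the explicit code point ranges of `isCJK`, so a few characters (for example the Katakana prolonged sound mark, which Unicode assigns to the Common script) are classified differently.

```java
import java.lang.Character.UnicodeScript;
import java.util.regex.Pattern;

class WordCountSketch {
    // Same regular expression as the removed HTML_CONTENT_REMOVAL constant.
    private static final Pattern HTML = Pattern.compile(
        "(?is)<(?:script|style)\\b[^>]*>.*?</(?:script|style)>|<[^>]+>|&[a-zA-Z0-9#]+;");

    /** Replaces script/style blocks, tags, and entities with a single space each. */
    static String extractText(String html) {
        return (html == null || html.isBlank()) ? "" : HTML.matcher(html).replaceAll(" ");
    }

    // Assumption: a script lookup stands in for the hand-written ranges in isCJK.
    static boolean isCjk(int codePoint) {
        UnicodeScript script = UnicodeScript.of(codePoint);
        return script == UnicodeScript.HAN || script == UnicodeScript.HIRAGANA
            || script == UnicodeScript.KATAKANA || script == UnicodeScript.HANGUL;
    }

    /** Counts CJK code points individually and other letter/digit runs as one word each. */
    static int countWords(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (isCjk(codePoint)) {
                count++;
                inWord = false;
            } else if (Character.isLetterOrDigit(codePoint)) {
                if (!inWord) {
                    count++;
                    inWord = true;
                }
            } else {
                inWord = false;
            }
            // Advance by one or two chars depending on whether this is a surrogate pair.
            i += Character.charCount(codePoint);
        }
        return count;
    }
}
```

Under these rules `countWords(extractText("<p>Hello 世界 foo123</p>"))` evaluates to 4: one for `Hello`, one each for 世 and 界, and one for `foo123`.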
-    /**
-     * Check if a Unicode code point belongs to CJK (Chinese, Japanese, Korean) character blocks.
-     * Includes various CJK unified ideographs, compatibility ideographs, and phonetic extensions.
-     * 检查 Unicode 码点是否属于中日韩 (CJK) 字符块。
-     * 包括各种 CJK 统一表意文字、兼容表意文字和音标扩展。
-     * Optimized CJK character detection using range checks.
-     * 使用范围检查优化的CJK字符检测。
-     */
-    private static boolean isCJK(int codePoint) {
-        // 常见CJK范围的快速检查 / Fast check for common CJK ranges
-        return (codePoint >= 0x4E00 && codePoint <= 0x9FFF) || // CJK Unified Ideographs
-            (codePoint >= 0x3400 && codePoint <= 0x4DBF) || // CJK Extension A
-            (codePoint >= 0x20000 && codePoint <= 0x2A6DF) || // CJK Extension B
-            (codePoint >= 0x2A700 && codePoint <= 0x2B73F) || // CJK Extension C
-            (codePoint >= 0x2B740 && codePoint <= 0x2B81F) || // CJK Extension D
-            (codePoint >= 0x2B820 && codePoint <= 0x2CEAF) || // CJK Extension E
-            (codePoint >= 0x2CEB0 && codePoint <= 0x2EBEF) || // CJK Extension F
-            (codePoint >= 0xF900 && codePoint <= 0xFAFF) || // CJK Compatibility Ideographs
-            (codePoint >= 0x2F800 && codePoint <= 0x2FA1F) ||
-            // CJK Compatibility Ideographs Supplement
-            (codePoint >= 0x3040 && codePoint <= 0x309F) || // Hiragana
-            (codePoint >= 0x30A0 && codePoint <= 0x30FF) || // Katakana
-            (codePoint >= 0x31F0 && codePoint <= 0x31FF) || // Katakana Phonetic Extensions
-            (codePoint >= 0xAC00 && codePoint <= 0xD7AF) || // Hangul Syllables
-            (codePoint >= 0x1100 && codePoint <= 0x11FF) || // Hangul Jamo
-            (codePoint >= 0x3130 && codePoint <= 0x318F); // Hangul Compatibility Jamo
+    public ExtraApiStatsFinderImpl(PostWordCountService postWordCountService) {
+        this.postWordCountService = postWordCountService;
     }

     /**
      * Unified word count API without slug support.
      * If name provided, count by name; otherwise sum word counts across all posts
      * (release/draft selectable by version).
-     * 统一的字数统计API,不支持slug。
-     * 如果提供name参数则统计指定文章,否则统计所有文章的字数总和
+     * 统一的字数统计 API
+     * 若提供 name 参数则按名称统计,否则统计所有文章的总字数(version 可选 release/draft)
      *
-     * @param params parameter map: name? version? ('release'|'draft', default 'release')
-     * @return word count as Mono (non-negative)
+     * @param params parameter map: name? version? ('release'|'draft', default 'release') /
+     * 参数映射:name?version?('release' 或 'draft',默认 'release')
+     * @return word count as Mono (non-negative) / 返回字数(非负)的 Mono
      */
     @Override
-    public Mono<Integer> wordCount(Map<String, Object> params) {
+    public Mono<BigInteger> postWordCount(Map<String, Object> params) {
         Map<String, Object> map = params == null ? java.util.Collections.emptyMap() : params;
-        String name = String.valueOf(map.get("name"));
+        String postName = String.valueOf(map.get("name"));
         boolean isDraft =
             String.valueOf(map.getOrDefault("version", "release")).equalsIgnoreCase("draft");

-        if ("null".equals(name) || name.isBlank()) {
-            return sumWordCountsAcrossAllPosts(isDraft);
+        if ("null".equals(postName) || postName.isBlank()) {
+            return postWordCountService.getTotalPostWordCount(isDraft);
         }

-        return isDraft ? postContentCountByName(name, "getHeadContent")
-            : postContentCountByName(name, "getReleaseContent");
-    }
-
-    /**
-     * sum word counts across all posts with pagination.
-     * 统计所有文章的字数总和。
-     */
-    private Mono<Integer> sumWordCountsAcrossAllPosts(boolean isDraft) {
-        return client.listAll(Post.class, ListOptions.builder().build(), Sort.unsorted())
-            .map(post -> post.getMetadata().getName()) // 提取需要的名称 / Extract the post name
-            .flatMapSequential(postName -> isDraft ? postContentCountByName(postName, "getHeadContent")
-                : postContentCountByName(postName, "getReleaseContent"), 1024) // 1024 并发 / Concurrency of 1024
-            .reduce(0, Integer::sum) // 直接累加 / Sum directly
-            .onErrorReturn(0);
+        return postWordCountService.getPostWordCount(postName, isDraft);
     }
 }
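The parameter handling that remains in `postWordCount` can be exercised without Halo or Reactor. The sketch below copies the three parsing lines from the diff and returns a label in place of calling the service; `ParamRoutingSketch`, `route`, and the label format are hypothetical. One consequence worth noting: `String.valueOf` turns a missing `name` into the string `"null"`, so a post whose `metadata.name` is literally `null` would be treated as "no name given".

```java
import java.util.Collections;
import java.util.Map;

class ParamRoutingSketch {
    /** Returns "total/<version>" or "post/<name>/<version>" for the given template params. */
    static String route(Map<String, Object> params) {
        Map<String, Object> map = params == null ? Collections.emptyMap() : params;
        // String.valueOf((Object) null) yields the four-character string "null".
        String postName = String.valueOf(map.get("name"));
        boolean isDraft =
            String.valueOf(map.getOrDefault("version", "release")).equalsIgnoreCase("draft");
        String version = isDraft ? "draft" : "release";

        if ("null".equals(postName) || postName.isBlank()) {
            return "total/" + version; // the finder calls getTotalPostWordCount(isDraft) here
        }
        return "post/" + postName + "/" + version; // the finder calls getPostWordCount(postName, isDraft) here
    }
}
```

For example, `route(null)` and `route(Map.of())` both give `total/release`, which matches the README's statement that `postWordCount()` and `postWordCount({})` are equivalent.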
