Handling HTML entity characters via the WordPress REST API and avoiding unwanted escaping

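When a post is updated through a WordPress REST API route, the core formats the data with the sanitize_post function. Symbols that have entity forms (<, >, &) are automatically converted to their respective entities (&lt;, &gt;, &amp;).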

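The characters < and > open and close HTML elements, and WordPress can fail to convert them to entities. For example, when the content has an opening < but no closing >, only part of the content is converted, which is very inconvenient.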

A simple example. Original text: This is <test content"

After conversion: This is <test content&quot;

I recommend using the following two functions together so that HTML entity conversion is handled in one consistent place.

htmlspecialchars() converts predefined characters to HTML entities.
htmlspecialchars_decode() converts predefined HTML entities back to characters.
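To make the pairing concrete, here is a minimal sketch in plain PHP (no WordPress required) that encodes a sample string and decodes it again. The sample text and variable names are only illustrative assumptions.

```php
<?php
// Illustrative sample containing an unclosed "<", an ampersand and double quotes.
$original = 'This is <test content & "more"';

// Convert the predefined characters (<, >, &, ") to HTML entities.
$encoded = htmlspecialchars($original);

// Convert the entities back to the original characters.
$decoded = htmlspecialchars_decode($encoded);

echo $encoded, PHP_EOL; // This is &lt;test content &amp; &quot;more&quot;
echo $decoded, PHP_EOL; // This is <test content & "more"
```

Because the two functions are inverses for these characters, whatever is encoded on save can be recovered unchanged on read.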

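Add pre-save content hooks. To stay compatible with the original program logic, htmlspecialchars is applied only when the content being saved contains one of the entity symbols.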

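// Escape the value with htmlspecialchars() only when it contains >, < or &.
function cpury_custom_content_save_pre($data) {
    $symbols = array('>', '<', '&');

    foreach ($symbols as $symbol) {
        if (strpos($data, $symbol) !== false) {
            // At least one special symbol is present: escape the whole value.
            return htmlspecialchars($data);
        }
    }

    // No special symbols: keep the value unchanged.
    return $data;
}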

add_filter( 'content_save_pre', 'cpury_custom_content_save_pre', 9 ); 
add_filter( 'excerpt_save_pre', 'cpury_custom_content_save_pre', 9 ); 
add_filter( 'title_save_pre', 'cpury_custom_content_save_pre', 9 );

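When the API returns the original text, reverse the processing with htmlspecialchars_decode. At this point there is generally no need to check for entity symbols.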

return array(
    'code' => 0,
    'data' => array(
        'title' => htmlspecialchars_decode($post->post_title),
        'excerpt' => htmlspecialchars_decode($post->post_excerpt),
        'content' => htmlspecialchars_decode($post->post_content),
    ),
);
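The save-then-read flow can be checked outside WordPress with a small sketch. Here escape_if_special is a hypothetical stand-in that mirrors the logic of the save hook, and the sample strings are assumptions chosen for illustration. It is not the code WordPress itself runs.

```php
<?php
// Hypothetical stand-in for the save hook: escape only when >, < or & is present.
function escape_if_special(string $data): string {
    foreach (array('>', '<', '&') as $symbol) {
        if (strpos($data, $symbol) !== false) {
            return htmlspecialchars($data);
        }
    }
    return $data;
}

$samples = array('This is <test content"', 'plain text', 'a & b');
$results = array();

foreach ($samples as $sample) {
    $stored = escape_if_special($sample);           // value that would be saved
    $restored = htmlspecialchars_decode($stored);   // value the API would return
    $results[] = array($sample, $stored, $restored);
    echo $stored, ' => ', $restored, PHP_EOL;
}
```

In each case the restored value equals the input, which is why decoding unconditionally on read is safe: a stored value that was never escaped contains no & and therefore no entity to decode.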

Update: 2023-3-9

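I think I have finally worked out where the problem comes from: WordPress has a mechanism that automatically filters HTML tags, located in the file /wp-includes/kses.php: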

function kses_init_filters() {
	// Normal filtering.
	add_filter( 'title_save_pre', 'wp_filter_kses' );

	// Comment filtering.
	if ( current_user_can( 'unfiltered_html' ) ) {
		add_filter( 'pre_comment_content', 'wp_filter_post_kses' );
	} else {
		add_filter( 'pre_comment_content', 'wp_filter_kses' );
	}

	// Post filtering.
	add_filter( 'content_save_pre', 'wp_filter_post_kses' );
	add_filter( 'excerpt_save_pre', 'wp_filter_post_kses' );
	add_filter( 'content_filtered_save_pre', 'wp_filter_post_kses' );
}

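These filters are applied to several fields as they are saved (title, content, excerpt), which leads to the unexpected results described above.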

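The fix is simple. You could edit the core and comment out this function, but the better approach is to remove the hooks in the theme's functions.php: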

remove_action('init', 'kses_init');

remove_action('set_current_user', 'kses_init');

References

https://blog.csdn.net/weixin_34388311/article/details/115737681