Original source: esolangs.org

REXS is an esoteric programming language created by User:Uellenberg, built to simplify reading and writing regular expressions (especially complex ones).

If you want to see it in action first, skip to my own example at the end of this article.

GitHub repository

Description

Every action in REXS is a function, and functions come in two kinds: bodied and non-bodied. Every function consists of a function name (always lowercase), followed by an opening parenthesis, optional parameters, and a closing parenthesis (for example, NAME(PARAMETERS)). A bodied function is followed by braces (NAME(PARAMETERS) {}), and a non-bodied function is followed by a semicolon (NAME(PARAMETERS);).

Specifications

Parameters

The parameters a function accepts depend on the specific function. If there are multiple parameters, they are separated by commas (param1, param2). In general, parameters fall into three distinct types, although a given function may accept parameters outside these categories:

Control

Control parameters are named values, defined per function, that select a specific behavior. For example, assert(START); has the parameter START, which is a control parameter. In general, control parameters consist only of uppercase characters and can be thought of as enums.

String

String parameters are, as their name suggests, strings. They are surrounded by double quotes. For example, match("t"); has the parameter "t", which is a string parameter. When the function uses this parameter, the double quotes on either end are stripped. Strings may contain every character except double quotes. If you need a double quote in a function, there will likely be a control parameter (such as QUOTE) that provides it.

Integer

Integer parameters are integer numbers. For example, repeat(10) {} has the parameter 10, which is an integer parameter.

Functions

Here, the two classes of functions introduced in the description are explained in greater detail.

Bodied Functions

Bodied functions perform some action on their body (everything between their braces), based on their parameters. The body can contain both bodied and non-bodied functions, although some functions restrict which functions may appear inside their body.

Non-bodied Functions

Non-bodied functions perform some action based on their parameters alone.

Functions

Match

Match (non-bodied) is one of the most essential functions in REXS. It matches a literal string in the regular expression.

Type | Allowed Values | Description | Required
String or Control | Strings or Match controls | The value being matched | Yes
Integer | Integers | An input value for specific controls | No

Assert

Assert (non-bodied) is another important function in REXS. It is similar to a match, except that what it asserts is not included in the match result.

Type | Allowed Values | Description | Required
Control | Assert controls | The value being asserted | Yes

Flag

Flag (non-bodied) sets a flag on the regular expression.

Type | Allowed Values | Description | Required
Control | i / g / m / s / u / y | The flag being set | Yes
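To make the effect of flags concrete, here is a small JavaScript sketch using hand-written regex literals rather than REXS output. The compiled example at the end of this article shows the flags m and g ending up as the suffix of the regex literal (/.../gm); the pattern and the sample string below are made up for illustration.

```javascript
// Flags as they appear on a JavaScript regex literal.
const sample = "Cat cat CAT";

// No flags: case-sensitive, stops at the first match.
const plain = sample.match(/cat/);

// i: ignore case; g: return every match instead of only the first.
const all = sample.match(/cat/gi);

console.log(plain[0], plain.index); // cat 4
console.log(all);                   // [ 'Cat', 'cat', 'CAT' ]
```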

Group

Group (bodied) defines a capturing group around everything inside its body.

Backref

Backref (non-bodied) matches the value of a capturing group, referenced by its index.

Type | Allowed Values | Description | Required
Integer | Integers | The index of the capturing group | Yes
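As a rough illustration of what a back-reference does once compiled, the JavaScript sketch below uses a hand-written regular expression. It assumes, without confirmation from the REXS documentation, that a group followed by backref(1) compiles to something like (...)\1; the pattern and sample strings are invented for this example.

```javascript
// Hand-written regex, assumed to resemble what group() + backref(1) compiles to.
// Group 1 captures one digit; \1 then requires that same digit again.
const doubledDigit = /(\d)\1/;

const hit = "a77b".match(doubledDigit);
console.log(hit[0]); // "77"
console.log(hit[1]); // "7"

// "78" has two digits, but the second differs from the captured one.
const miss = doubledDigit.test("a78b");
console.log(miss); // false
```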

Repeat

Repeat (bodied) repeats its body a specified number of times.

Type | Allowed Values | Description | Required
Integer | Integers | The minimum number of repeats needed to match | Yes
Integer | Integers, or inf / infinity / forever | The maximum number of repeats that can be matched | Yes
Control | nongreedy | If specified, makes the repeat non-greedy, so it repeats as few times as possible | No
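The compiled examples later in this article show repeat(0, inf) becoming *, repeat(0, inf, nongreedy) becoming *?, repeat(1, inf, nongreedy) becoming +? and repeat(0, 1, nongreedy) becoming ??. The JavaScript sketch below uses those quantifiers in hand-written patterns (the sample strings are made up) to show what non-greedy changes.

```javascript
const text = "<a><b>";

// Greedy: .* takes as much as possible, so the match runs to the last ">".
const greedy = text.match(/<.*>/)[0];
// Non-greedy: .*? takes as little as possible, so the match stops at the first ">".
const lazy = text.match(/<.*?>/)[0];

// +? needs at least one repeat and then stops as early as it can.
const atLeastOne = "aaa".match(/a+?/)[0];
// ?? prefers zero repeats, so here it matches the empty string.
const optional = "aaa".match(/a??/)[0];

console.log(greedy);     // "<a><b>"
console.log(lazy);       // "<a>"
console.log(atLeastOne); // "a"
console.log(optional);   // ""
```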

Set

Set (bodied) creates a character set and puts its body inside the set. Only the match and to functions are allowed inside its body.

To

To (non-bodied) is used inside a set and represents the range character (-) in a regex character set.
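For readers less familiar with character sets, the JavaScript sketch below shows a hand-written set with two ranges. It assumes that a set whose body holds a match, a to, and another match compiles to a range such as [a-f]; this mapping is inferred from the descriptions above and has not been checked against the REXS compiler.

```javascript
// Hand-written character set: any digit 0-9 or any letter a-f, one or more times.
// The "-" inside the brackets is the range character that to is described as producing.
const hexRun = /[0-9a-f]+/;

const found = "#3fa9z".match(hexRun);
console.log(found[0]);    // "3fa9" ("#" and "z" are outside the set)
console.log(found.index); // 1
```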

Or

Or (bodied) creates an or-expression that matches exactly one of the options in its body. Only orpart functions are allowed inside its body.

OrPart

OrPart (bodied) is used inside or and defines one possible option for the or to match.
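The compiled example at the end of this article contains the alternation (?:export|import). The JavaScript sketch below uses that fragment on its own, on made-up input, to show that exactly one of the options matches at a given position.

```javascript
// Alternation fragment taken from the compiled regex shown later in this article.
const keyword = /(?:export|import)/;

const first = "import x from 'y'".match(keyword)[0];
const second = "export default z".match(keyword)[0];
const none = keyword.test("require('y')");

console.log(first);  // "import"
console.log(second); // "export"
console.log(none);   // false
```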

Before

Before (bodied) is similar to assert in that it matches something but does not include it in the match. It asserts that its body matches (or does not match) before the main expression.

Type | Allowed Values | Description | Required
Control | not | If used, asserts that the body does not match instead of asserting that it does | No

After

After (bodied) is similar to assert in that it matches something but does not include it in the match. It asserts that its body matches (or does not match) after the main expression.

Type | Allowed Values | Description | Required
Control | not | If used, asserts that the body does not match instead of asserting that it does | No
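The descriptions of before and after read like regex lookbehind and lookahead. The JavaScript sketch below assumes that mapping (before as (?<=...), after as (?=...), and the not control as the negated forms); this is a guess from the wording above, not something confirmed by the REXS compiler, and all patterns and inputs are invented.

```javascript
// Lookbehind: digits that come right after a "$"; the "$" is not part of the match.
const price = "cost $42".match(/(?<=\$)\d+/)[0];

// Lookahead: digits that are followed by "px"; the "px" is not part of the match.
const pixels = "width: 100px".match(/\d+(?=px)/)[0];

// Negative lookbehind: a digit that is NOT preceded by "a".
const notAfterA = "a1 b2".match(/(?<!a)\d/)[0];

console.log(price);     // "42"
console.log(pixels);    // "100"
console.log(notAfterA); // "2"
```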

Control Sets

Below is a list of the possible control values for specific functions:

Match Controls

  • ANY - Matches any character.
  • DIGIT - Matches any digit (0-9).
  • NON_DIGIT - Matches any non-digit (not 0-9).
  • ALPHANUM - Matches any alphanumeric character (0-9, a-z or A-Z).
  • NON_ALPHANUM - Matches any non-alphanumeric character (not 0-9, a-z or A-Z).
  • SPACE - Matches any whitespace character (\s).
  • NON_SPACE - Matches any non-whitespace character.
  • HTAB - Matches a horizontal tab (\t).
  • VTAB - Matches a vertical tab (\v).
  • RETURN - Matches a carriage return (\r).
  • LINEFEED - Matches a linefeed (\n).
  • FORMFEED - Matches a formfeed (\f).
  • BACKSPACE - Matches a backspace (\b).
  • NULL - Matches a null character (\0).
  • QUOTE - Matches a double quote (").
  • *CONTROL - Matches a specific control character. The second parameter of match must be a value A-Z, specifying which control character is matched.
  • *HEX - Matches a specific character by its hex code. The second parameter of match must be a hex string of length 2 or 4, giving the hex code of the character being matched.

* = requires the second parameter of match.
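The two starred controls take a second parameter. The JavaScript sketch below shows the regex escapes they most plausibly correspond to: \cX for CONTROL, and \xHH or \uHHHH for HEX. That correspondence is an assumption based on the descriptions in the list above, and the sample characters are arbitrary.

```javascript
// \xHH and \uHHHH match a character by its 2- or 4-digit hex code.
const byHex2 = /\x41/;   // U+0041, "A"
const byHex4 = /\u00e9/; // U+00E9, "é"

// \cX matches the control character Ctrl-X; Ctrl-J is the linefeed (code 10).
const ctrlJ = /\cJ/;

const results = [
  byHex2.test("A"),      // true
  byHex4.test("\u00e9"), // true
  ctrlJ.test("\n"),      // true
  ctrlJ.test("J"),       // false: the letter J itself is not a control character
];
console.log(results); // [ true, true, true, false ]
```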

Assert Controls

  • START - Asserts the start of the input string, or the start of a line (depending on the flags).
  • END - Asserts the end of the input string, or the end of a line (depending on the flags).
  • WORD_BOUNDARY - Asserts a word boundary (the position at the start or end of a word).
  • NOT_WORD_BOUNDARY - Asserts that the current position is not a word boundary.
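The IRC example in this article shows START and END compiling to ^ and $. The JavaScript sketch below illustrates how the m flag changes those two, and what the word-boundary assertions do, assuming WORD_BOUNDARY and NOT_WORD_BOUNDARY correspond to \b and \B (that part is an assumption). Patterns and inputs are invented.

```javascript
const lines = "one\ntwo";

// Without m, ^ and $ only assert the start and end of the whole string.
const wholeString = /^two$/.test(lines);
// With m, ^ and $ also assert the start and end of each line.
const perLine = /^two$/m.test(lines);

// \b asserts a word boundary; \B asserts the absence of one.
const standalone = /\bcat\b/.test("a cat sat");
const embedded = /\bcat\b/.test("concat");
const insideWord = /\Bcat/.test("concat");

console.log(wholeString, perLine);            // false true
console.log(standalone, embedded, insideWord); // true false true
```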

Examples

IRC

Let's take a look at a basic IRC message:

  name!email PRIVMSG #channel :message

If we want to parse all of the details from this, we can build a regular expression in REXS like this:

  assert(START);
  group() {
      repeat(0, inf, nongreedy) {
          match(ANY);
      }
  }
  match("!");
  group() {
      repeat(0, inf, nongreedy) {
          match(ANY);
      }
  }
  match(" PRIVMSG #");
  group() {
      repeat(0, inf, nongreedy) {
          match(ANY);
      }
  }
  match(" :");
  group() {
      repeat(0, inf) {
          match(ANY);
      }
  }
  assert(END);

This code will then be compiled to:

  /^(.*?)!(.*?) PRIVMSG \#(.*?) :(.*)$/
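To check that the compiled expression pulls out the four parts, here is a short JavaScript sketch that applies it to the sample message from above. The variable names are chosen for this example only.

```javascript
// The compiled regex from the IRC example, pasted in as a literal.
const ircMessage = /^(.*?)!(.*?) PRIVMSG \#(.*?) :(.*)$/;

// Destructure the four capturing groups; index 0 (the full match) is skipped.
const [, name, email, channel, message] =
  "name!email PRIVMSG #channel :message".match(ircMessage);

console.log(name);    // "name"
console.log(email);   // "email"
console.log(channel); // "channel"
console.log(message); // "message"
```

The first three groups are non-greedy so that each one stops at the next literal separator; the last group is greedy because everything up to the end of the string belongs to the message.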

External Resources

Compiler and Decompiler

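Finally, an example of my own

When tsc compiles to ES modules, it does not add the .js suffix to the import paths in its output, so the compiled files cannot be loaded as modules from a browser script tag. The community has not resolved this issue.

References:
segmentfault.com/q/1010000038671707
Community discussion

So I had to handle it myself: after the TypeScript files are compiled, a script adds the .js suffix to the paths in import and export statements. I wrote that script around a regular expression:
https://github.com/liulinboyi/HTMLParser/blob/main/script/addSuffixJs.js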

  const fs = require("fs/promises")
  const path = require("path")

  void async function () {
      try {
          // Root of the tsc ES module output
          let p = path.resolve(__dirname, "../dist-esmodule")
          let paths = await fs.readdir(p)
          // Walk the output directory depth-first with an explicit stack
          let stack = [...paths]
          while (stack.length) {
              let top = stack.pop()
              let pat = path.resolve(p, top)
              let stat = await fs.stat(pat)
              if (stat.isDirectory()) {
                  // Push the directory's entries, keeping paths relative to p
                  let temp = await fs.readdir(pat)
                  if (temp) {
                      for (let i of temp) {
                          stack.push(path.join(top, i))
                      }
                  }
              } else {
                  let personList = await fs.readFile(pat, {encoding: "utf8"})
                  // Matches import/export statements; group 1 is the module path
                  const regexpNames = /(?:export|import)(?:\s)*?(?:\{)??.*?(?:\})??(?:\s)*?from(?:\s)*?"(.+?)"/gm
                  const match = personList.matchAll(regexpNames)
                  // Characters inserted so far; match indexes refer to the original text
                  let count = 0
                  for (let item of match) {
                      // Skip paths that already end in .js (the dot must be escaped)
                      if (/\.js$/.test(item[1])) {
                          continue
                      }
                      let temp = item[0]
                      let index = item.index + count
                      // The match ends with the closing quote, so insert .js just before it
                      let now = `${temp.slice(0, -1)}.js"`
                      let past = personList.slice(0, index)
                      let feature = personList.slice(index + temp.length, personList.length)
                      personList = `${past}${now}${feature}`
                      // ".js" adds three characters
                      count = count + 3
                  }
                  await fs.writeFile(pat, personList, {encoding: "utf8"})
              }
          }
      } catch (error) {
          console.log(error)
      }
  }()

The regular expression in that script was written with REXS:

  import { Compile } from "rexs"

  const expression = Compile(`
  or() {
      match("export");
      match("import");
  }
  repeat(0, inf, nongreedy) {
      match(SPACE);
  }
  repeat(0, 1, nongreedy) {
      match("{");
  }
  repeat(0, inf, nongreedy) {
      match(ANY);
  }
  repeat(0, 1, nongreedy) {
      match("}");
  }
  repeat(0, inf, nongreedy) {
      match(SPACE);
  }
  match("from");
  repeat(0, inf, nongreedy) {
      match(SPACE);
  }
  match('"');
  group() {
      repeat(1, inf, nongreedy) {
          match(ANY);
      }
  }
  match('"');
  flag(m)
  flag(g)
  `);
  console.log(expression)
  // /(?:export|import)(?:\s)*?(?:\{)??.*?(?:\})??(?:\s)*?from(?:\s)*?"(.+?)"/gm

The final compiled result is:

  /(?:export|import)(?:\s)*?(?:\{)??.*?(?:\})??(?:\s)*?from(?:\s)*?"(.+?)"/gm
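As a quick way to try the compiled expression without touching the file system, the JavaScript sketch below applies it to an in-memory string. The helper addJsSuffix and the sample source are hypothetical and written for this illustration; it is a more compact alternative to the directory-walking script above, not that script itself.

```javascript
// The compiled regex from above; group 1 captures the module path.
const importPath = /(?:export|import)(?:\s)*?(?:\{)??.*?(?:\})??(?:\s)*?from(?:\s)*?"(.+?)"/gm;

// Hypothetical helper: append .js to every import/export path that lacks it.
function addJsSuffix(source) {
  return source.replace(importPath, (statement, modulePath) =>
    // Each match ends with the closing quote, so .js goes just before it.
    /\.js$/.test(modulePath) ? statement : `${statement.slice(0, -1)}.js"`
  );
}

const input = 'import { a } from "./a"\nexport * from "./b.js"\n';
const output = addJsSuffix(input);
console.log(output);
// import { a } from "./a.js"
// export * from "./b.js"
```

Using replace with a callback avoids the manual index bookkeeping, because each match is rewritten in place as it is found.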