slug: /zh/sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources machine_translated: true machine_translated_rev: 72537a2d527c63c07aa5d2361a8829f3895cf2bd sidebar_position: 43

sidebar_label: “\u5916\u90E8\u5B57\u5178\u7684\u6765\u6E90”

外部字典的来源 {#dicts-external-dicts-dict-sources}

外部字典可以从许多不同的来源连接。

如果使用xml-file配置字典,则配置如下所示:

  1. <clickhouse>
  2. <dictionary>
  3. ...
  4. <source>
  5. <source_type>
  6. <!-- Source configuration -->
  7. </source_type>
  8. </source>
  9. ...
  10. </dictionary>
  11. ...
  12. </clickhouse>

在情况下 DDL-查询,相等的配置将看起来像:

  1. CREATE DICTIONARY dict_name (...)
  2. ...
  3. SOURCE(SOURCE_TYPE(param1 val1 ... paramN valN)) -- Source configuration
  4. ...

源配置在 source 科。

对于源类型 本地文件, 可执行文件, HTTP(s), ClickHouse 可选设置:

  1. <source>
  2. <file>
  3. <path>/opt/dictionaries/os.tsv</path>
  4. <format>TabSeparated</format>
  5. </file>
  6. <settings>
  7. <format_csv_allow_single_quotes>0</format_csv_allow_single_quotes>
  8. </settings>
  9. </source>

  1. SOURCE(FILE(path './user_files/os.tsv' format 'TabSeparated'))
  2. SETTINGS(format_csv_allow_single_quotes = 0)

来源类型 (source_type):

本地文件 {#dicts-external_dicts_dict_sources-local_file}

设置示例:

  1. <source>
  2. <file>
  3. <path>/opt/dictionaries/os.tsv</path>
  4. <format>TabSeparated</format>
  5. </file>
  6. </source>

  1. SOURCE(FILE(path './user_files/os.tsv' format 'TabSeparated'))

设置字段:

  • path – The absolute path to the file.
  • format – The file format. All the formats described in “格式” 支持。

可执行文件 {#dicts-external_dicts_dict_sources-executable}

使用可执行文件取决于 字典如何存储在内存中. 如果字典存储使用 cachecomplex_key_cache,ClickHouse通过向可执行文件的STDIN发送请求来请求必要的密钥。 否则,ClickHouse将启动可执行文件并将其输出视为字典数据。

设置示例:

  1. <source>
  2. <executable>
  3. <command>cat /opt/dictionaries/os.tsv</command>
  4. <format>TabSeparated</format>
  5. </executable>
  6. </source>

  1. SOURCE(EXECUTABLE(command 'cat /opt/dictionaries/os.tsv' format 'TabSeparated'))

设置字段:

  • command – The absolute path to the executable file, or the file name (if the program directory is written to PATH).
  • format – The file format. All the formats described in “格式” 支持。

Http(s) {#dicts-external_dicts_dict_sources-http}

使用HTTP(s)服务器取决于 字典如何存储在内存中. 如果字典存储使用 cachecomplex_key_cache,ClickHouse通过通过发送请求请求必要的密钥 POST 方法。

设置示例:

  1. <source>
  2. <http>
  3. <url>http://[::1]/os.tsv</url>
  4. <format>TabSeparated</format>
  5. <credentials>
  6. <user>user</user>
  7. <password>password</password>
  8. </credentials>
  9. <headers>
  10. <header>
  11. <name>API-KEY</name>
  12. <value>key</value>
  13. </header>
  14. </headers>
  15. </http>
  16. </source>

  1. SOURCE(HTTP(
  2. url 'http://[::1]/os.tsv'
  3. format 'TabSeparated'
  4. credentials(user 'user' password 'password')
  5. headers(header(name 'API-KEY' value 'key'))
  6. ))

为了让ClickHouse访问HTTPS资源,您必须 配置openSSL 在服务器配置中。

设置字段:

  • url – The source URL.
  • format – The file format. All the formats described in “格式” 支持。
  • credentials – Basic HTTP authentication. Optional parameter.
    • user – Username required for the authentication.
    • password – Password required for the authentication.
  • headers – All custom HTTP headers entries used for the HTTP request. Optional parameter.
    • header – Single HTTP header entry.
    • name – Identifiant name used for the header send on the request.
    • value – Value set for a specific identifiant name.

ODBC {#dicts-external_dicts_dict_sources-odbc}

您可以使用此方法连接具有ODBC驱动程序的任何数据库。

设置示例:

  1. <source>
  2. <odbc>
  3. <db>DatabaseName</db>
  4. <table>ShemaName.TableName</table>
  5. <connection_string>DSN=some_parameters</connection_string>
  6. <invalidate_query>SQL_QUERY</invalidate_query>
  7. </odbc>
  8. </source>

  1. SOURCE(ODBC(
  2. db 'DatabaseName'
  3. table 'SchemaName.TableName'
  4. connection_string 'DSN=some_parameters'
  5. invalidate_query 'SQL_QUERY'
  6. ))

设置字段:

  • db – Name of the database. Omit it if the database name is set in the <connection_string> 参数。
  • table – Name of the table and schema if exists.
  • connection_string – Connection string.
  • invalidate_query – Query for checking the dictionary status. Optional parameter. Read more in the section 更新字典.

ClickHouse接收来自ODBC-driver的引用符号,并将查询中的所有设置引用到driver,因此有必要根据数据库中的表名大小写设置表名。

如果您在使用Oracle时遇到编码问题,请参阅相应的 FAQ 文章.

ODBC字典功能的已知漏洞 {#known-vulnerability-of-the-odbc-dictionary-functionality}

  1. :::info "注意"
  2. 通过ODBC驱动程序连接参数连接到数据库时 `Servername` 可以取代。 在这种情况下,值 `USERNAME` `PASSWORD` `odbc.ini` 被发送到远程服务器,并且可能会受到损害。

不安全使用示例

让我们为PostgreSQL配置unixODBC。 的内容 /etc/odbc.ini:

  1. [gregtest]
  2. Driver = /usr/lib/psqlodbca.so
  3. Servername = localhost
  4. PORT = 5432
  5. DATABASE = test_db
  6. #OPTION = 3
  7. USERNAME = test
  8. PASSWORD = test

如果然后进行查询,例如

  1. SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db');

ODBC驱动程序将发送的值 USERNAMEPASSWORDodbc.inisome-server.com.

连接Postgresql的示例 {#example-of-connecting-postgresql}

Ubuntu操作系统。

为PostgreSQL安装unixODBC和ODBC驱动程序:

  1. $ sudo apt-get install -y unixodbc odbcinst odbc-postgresql

配置 /etc/odbc.ini (或 ~/.odbc.ini):

  1. [DEFAULT]
  2. Driver = myconnection
  3. [myconnection]
  4. Description = PostgreSQL connection to my_db
  5. Driver = PostgreSQL Unicode
  6. Database = my_db
  7. Servername = 127.0.0.1
  8. UserName = username
  9. Password = password
  10. Port = 5432
  11. Protocol = 9.3
  12. ReadOnly = No
  13. RowVersioning = No
  14. ShowSystemTables = No
  15. ConnSettings =

ClickHouse中的字典配置:

  1. <clickhouse>
  2. <dictionary>
  3. <name>table_name</name>
  4. <source>
  5. <odbc>
  6. <!-- You can specify the following parameters in connection_string: -->
  7. <!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
  8. <connection_string>DSN=myconnection</connection_string>
  9. <table>postgresql_table</table>
  10. </odbc>
  11. </source>
  12. <lifetime>
  13. <min>300</min>
  14. <max>360</max>
  15. </lifetime>
  16. <layout>
  17. <hashed/>
  18. </layout>
  19. <structure>
  20. <id>
  21. <name>id</name>
  22. </id>
  23. <attribute>
  24. <name>some_column</name>
  25. <type>UInt64</type>
  26. <null_value>0</null_value>
  27. </attribute>
  28. </structure>
  29. </dictionary>
  30. </clickhouse>

  1. CREATE DICTIONARY table_name (
  2. id UInt64,
  3. some_column UInt64 DEFAULT 0
  4. )
  5. PRIMARY KEY id
  6. SOURCE(ODBC(connection_string 'DSN=myconnection' table 'postgresql_table'))
  7. LAYOUT(HASHED())
  8. LIFETIME(MIN 300 MAX 360)

您可能需要编辑 odbc.ini 使用驱动程序指定库的完整路径 DRIVER=/usr/local/lib/psqlodbcw.so.

连接MS SQL Server的示例 {#example-of-connecting-ms-sql-server}

Ubuntu操作系统。

安装驱动程序: :

  1. $ sudo apt-get install tdsodbc freetds-bin sqsh

配置驱动程序:

  1. $ cat /etc/freetds/freetds.conf
  2. ...
  3. [MSSQL]
  4. host = 192.168.56.101
  5. port = 1433
  6. tds version = 7.0
  7. client charset = UTF-8
  8. $ cat /etc/odbcinst.ini
  9. ...
  10. [FreeTDS]
  11. Description = FreeTDS
  12. Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
  13. Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
  14. FileUsage = 1
  15. UsageCount = 5
  16. $ cat ~/.odbc.ini
  17. ...
  18. [MSSQL]
  19. Description = FreeTDS
  20. Driver = FreeTDS
  21. Servername = MSSQL
  22. Database = test
  23. UID = test
  24. PWD = test
  25. Port = 1433

在ClickHouse中配置字典:

  1. <clickhouse>
  2. <dictionary>
  3. <name>test</name>
  4. <source>
  5. <odbc>
  6. <table>dict</table>
  7. <connection_string>DSN=MSSQL;UID=test;PWD=test</connection_string>
  8. </odbc>
  9. </source>
  10. <lifetime>
  11. <min>300</min>
  12. <max>360</max>
  13. </lifetime>
  14. <layout>
  15. <flat />
  16. </layout>
  17. <structure>
  18. <id>
  19. <name>k</name>
  20. </id>
  21. <attribute>
  22. <name>s</name>
  23. <type>String</type>
  24. <null_value></null_value>
  25. </attribute>
  26. </structure>
  27. </dictionary>
  28. </clickhouse>

  1. CREATE DICTIONARY test (
  2. k UInt64,
  3. s String DEFAULT ''
  4. )
  5. PRIMARY KEY k
  6. SOURCE(ODBC(table 'dict' connection_string 'DSN=MSSQL;UID=test;PWD=test'))
  7. LAYOUT(FLAT())
  8. LIFETIME(MIN 300 MAX 360)

DBMS {#dbms}

Mysql {#dicts-external_dicts_dict_sources-mysql}

设置示例:

  1. <source>
  2. <mysql>
  3. <port>3306</port>
  4. <user>clickhouse</user>
  5. <password>qwerty</password>
  6. <replica>
  7. <host>example01-1</host>
  8. <priority>1</priority>
  9. </replica>
  10. <replica>
  11. <host>example01-2</host>
  12. <priority>1</priority>
  13. </replica>
  14. <db>db_name</db>
  15. <table>table_name</table>
  16. <where>id=10</where>
  17. <invalidate_query>SQL_QUERY</invalidate_query>
  18. </mysql>
  19. </source>

  1. SOURCE(MYSQL(
  2. port 3306
  3. user 'clickhouse'
  4. password 'qwerty'
  5. replica(host 'example01-1' priority 1)
  6. replica(host 'example01-2' priority 1)
  7. db 'db_name'
  8. table 'table_name'
  9. where 'id=10'
  10. invalidate_query 'SQL_QUERY'
  11. ))

设置字段:

  • port – The port on the MySQL server. You can specify it for all replicas, or for each one individually (inside <replica>).

  • user – Name of the MySQL user. You can specify it for all replicas, or for each one individually (inside <replica>).

  • password – Password of the MySQL user. You can specify it for all replicas, or for each one individually (inside <replica>).

  • replica – Section of replica configurations. There can be multiple sections.

    1. - `replica/host` The MySQL host.
    2. - `replica/priority` The replica priority. When attempting to connect, ClickHouse traverses the replicas in order of priority. The lower the number, the higher the priority.
  • db – Name of the database.

  • table – Name of the table.

  • where – The selection criteria. The syntax for conditions is the same as for WHERE 例如,mysql中的子句, id > 10 AND id < 20. 可选参数。

  • invalidate_query – Query for checking the dictionary status. Optional parameter. Read more in the section 更新字典.

MySQL可以通过套接字在本地主机上连接。 要做到这一点,设置 hostsocket.

设置示例:

  1. <source>
  2. <mysql>
  3. <host>localhost</host>
  4. <socket>/path/to/socket/file.sock</socket>
  5. <user>clickhouse</user>
  6. <password>qwerty</password>
  7. <db>db_name</db>
  8. <table>table_name</table>
  9. <where>id=10</where>
  10. <invalidate_query>SQL_QUERY</invalidate_query>
  11. </mysql>
  12. </source>

  1. SOURCE(MYSQL(
  2. host 'localhost'
  3. socket '/path/to/socket/file.sock'
  4. user 'clickhouse'
  5. password 'qwerty'
  6. db 'db_name'
  7. table 'table_name'
  8. where 'id=10'
  9. invalidate_query 'SQL_QUERY'
  10. ))

ClickHouse {#dicts-external_dicts_dict_sources-clickhouse}

设置示例:

  1. <source>
  2. <clickhouse>
  3. <host>example01-01-1</host>
  4. <port>9000</port>
  5. <user>default</user>
  6. <password></password>
  7. <db>default</db>
  8. <table>ids</table>
  9. <where>id=10</where>
  10. </clickhouse>
  11. </source>

  1. SOURCE(CLICKHOUSE(
  2. host 'example01-01-1'
  3. port 9000
  4. user 'default'
  5. password ''
  6. db 'default'
  7. table 'ids'
  8. where 'id=10'
  9. ))

设置字段:

  • host – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a 分布 表并在后续配置中输入它。
  • port – The port on the ClickHouse server.
  • user – Name of the ClickHouse user.
  • password – Password of the ClickHouse user.
  • db – Name of the database.
  • table – Name of the table.
  • where – The selection criteria. May be omitted.
  • invalidate_query – Query for checking the dictionary status. Optional parameter. Read more in the section 更新字典.

Mongodb {#dicts-external_dicts_dict_sources-mongodb}

设置示例:

  1. <source>
  2. <mongodb>
  3. <host>localhost</host>
  4. <port>27017</port>
  5. <user></user>
  6. <password></password>
  7. <db>test</db>
  8. <collection>dictionary_source</collection>
  9. </mongodb>
  10. </source>

  1. SOURCE(MONGO(
  2. host 'localhost'
  3. port 27017
  4. user ''
  5. password ''
  6. db 'test'
  7. collection 'dictionary_source'
  8. ))

设置字段:

  • host – The MongoDB host.
  • port – The port on the MongoDB server.
  • user – Name of the MongoDB user.
  • password – Password of the MongoDB user.
  • db – Name of the database.
  • collection – Name of the collection.

Redis {#dicts-external_dicts_dict_sources-redis}

设置示例:

  1. <source>
  2. <redis>
  3. <host>localhost</host>
  4. <port>6379</port>
  5. <storage_type>simple</storage_type>
  6. <db_index>0</db_index>
  7. </redis>
  8. </source>

  1. SOURCE(REDIS(
  2. host 'localhost'
  3. port 6379
  4. storage_type 'simple'
  5. db_index 0
  6. ))

设置字段:

  • host – The Redis host.
  • port – The port on the Redis server.
  • storage_type – The structure of internal Redis storage using for work with keys. simple 适用于简单源和散列单键源, hash_map 用于具有两个键的散列源。 不支持具有复杂键的范围源和缓存源。 可以省略,默认值为 simple.
  • db_index – The specific numeric index of Redis logical database. May be omitted, default value is 0.

原始文章