openLooKeng documentation

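Audit Log

The openLooKeng audit logging feature is a custom event listener. It listens for cluster start and stop events and for nodes being dynamically added to or removed from the cluster, for Web UI user login and logout events, and for query events, where it is invoked when a query is created and when it completes (successfully or with a failure). An audit log entry contains the following information:

    - Time at which the event occurred
    - User ID
    - Address or identifier of the access initiator
    - Event type (operation)
    - Name of the resource accessed
    - Event result

Only one event listener plugin can be active at a time in an openLooKeng cluster.

Implementation

Audit logging is an implementation of io.prestosql.spi.eventlistener.EventListener in the HetuListener plugin. The overridden methods include AuditEventLogger#onQueryCreatedEvent and AuditEventLogger#onQueryCompletedEvent.

Configuration

To enable audit logging, the following settings must be present in etc/event-listener.properties:

    hetu.event.listener.type=AUDIT
    hetu.event.listener.listen.query.creation=true
    hetu.event.listener.listen.query.completion=true
    hetu.auditlog.logoutput=/var/log/
    hetu.auditlog.logconversionpattern=yyyy-MM-dd.HH

Other audit logging properties include:

    - hetu.event.listener.type: defines the logging type for the audit log. Allowed values are AUDIT and LOGGER.
    - hetu.auditlog.logoutput: defines the absolute path of the directory for audit files. Make sure the process running the openLooKeng server has write permission on this directory.
    - hetu.auditlog.logconversionpattern: defines the rotation pattern of the audit log. Allowed values are yyyy-MM-dd.HH and yyyy-MM-dd.

Example configuration file:

    event-listener.name=hetu-listener
    hetu.event.listener.type=AUDIT
    hetu.event.listener.listen.query.creation=true
    hetu.event.listener.listen.query.completion=true
    hetu.event.listener.audit.file=/var/log/hetu/hetu-audit.log
    hetu.event.listener.audit.filecount=1
    hetu.event.listener.audit.limit=100000
    hetu.auditlog.logoutput=/var/log/
    hetu.auditlog.logconversionpattern=yyyy-MM-dd.HH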
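The allowed values listed for the audit logging properties can be checked before a server restart. The following is a minimal Python sketch, not part of openLooKeng: the helper names parse_properties and check_audit_config are hypothetical, and the parser assumes simple key=value lines only.

```python
import posixpath

# Allowed values as listed in the property descriptions.
ALLOWED_TYPES = {"AUDIT", "LOGGER"}
ALLOWED_PATTERNS = {"yyyy-MM-dd.HH", "yyyy-MM-dd"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_audit_config(props):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    if props.get("hetu.event.listener.type") not in ALLOWED_TYPES:
        problems.append("hetu.event.listener.type must be AUDIT or LOGGER")
    if props.get("hetu.auditlog.logconversionpattern") not in ALLOWED_PATTERNS:
        problems.append(
            "hetu.auditlog.logconversionpattern must be yyyy-MM-dd.HH or yyyy-MM-dd"
        )
    if not posixpath.isabs(props.get("hetu.auditlog.logoutput", "")):
        problems.append("hetu.auditlog.logoutput must be an absolute directory path")
    return problems


sample = """
hetu.event.listener.type=AUDIT
hetu.event.listener.listen.query.creation=true
hetu.event.listener.listen.query.completion=true
hetu.auditlog.logoutput=/var/log/
hetu.auditlog.logconversionpattern=yyyy-MM-dd.HH
"""

print(check_audit_config(parse_properties(sample)))  # prints []
```

The check covers only the three properties whose allowed values are stated here; it does not verify write permission on the output directory.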

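Distributed Sort

Distributed sort allows sorting data that exceeds query.max-memory-per-node. It is enabled through the distributed_sort session property, or through the distributed-sort configuration property set in etc/config.properties on the coordinator. Distributed sort is enabled by default.

When distributed sort is enabled, the sort operator executes in parallel on multiple nodes in the cluster. The partially sorted data from each openLooKeng worker is then streamed to a single worker for the final merge. This technique allows the memory of multiple openLooKeng workers to be used for sorting.

The primary purpose of distributed sort is to allow sorting data sets that would not normally fit into the memory of a single node. A performance improvement can be expected, but it will not scale linearly with the number of nodes, because the data has to be merged by a single node.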
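The two phases described here, parallel partial sorts followed by a final merge on a single node, can be illustrated with a small Python sketch. It is a conceptual model only, not openLooKeng code: the "workers" are plain lists, the function names are hypothetical, and the partial sorts run sequentially here rather than in parallel.

```python
import heapq


def partial_sort(chunks):
    """Each 'worker' sorts only its own chunk of the data."""
    return [sorted(chunk) for chunk in chunks]


def final_merge(sorted_chunks):
    """A single node merges the pre-sorted streams into one ordered result."""
    return list(heapq.merge(*sorted_chunks))


# Three hypothetical workers, each holding a slice of the input.
worker_chunks = [[9, 1, 5], [8, 2, 7], [4, 6, 3]]
result = final_merge(partial_sort(worker_chunks))
print(result)  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The merge step reads every row on one node, which is why adding workers speeds up the partial sorts but not the final merge.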

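Dynamic Catalog

This section introduces the dynamic catalog feature of openLooKeng. Normally, an openLooKeng administrator adds a data source to the engine by placing a catalog profile (for example, hive.properties) in the catalog directory (etc/catalog). Whenever a catalog needs to be added, updated, or deleted, all coordinators and workers must be restarted.

To allow catalogs to be modified dynamically, openLooKeng introduced the dynamic catalog feature. The catalog-related configuration files are managed on a shared file system, and all coordinators and workers synchronize them from the shared file system to local storage and load them.

Configuration

First, set the following in etc/config.properties:

    catalog.dynamic-enabled=true

Second, configure the file system used to store dynamic catalog information in hdfs-config-default.properties. The profile name can be changed through the catalog.share.filesystem.profile property in etc/node.properties; the default is hdfs-config-default. See the file system documentation for more information.

Add the hdfs-config-default.properties file under etc/filesystem/, creating this directory if it does not exist:

    fs.client.type=hdfs
    hdfs.config.resources=/opt/openlookeng/config/core-site.xml, /opt/openlookeng/config/hdfs-site.xml
    hdfs.authentication.type=NONE
    fs.hdfs.impl.disable.cache=true

If Kerberos authentication is enabled on HDFS, use:

    fs.client.type=hdfs
    hdfs.config.resources=/opt/openlookeng/config/core-site.xml, /opt/openlookeng/config/hdfs-site.xml
    hdfs.authentication.type=KERBEROS
    hdfs.krb5.conf.path=/opt/openlookeng/config/krb5.conf
    hdfs.krb5.keytab.path=/opt/openlookeng/config/user.keytab
    # replace openlookeng@HADOOP.COM with your principal
    hdfs.krb5.principal=openlookeng@HADOOP.COM
    fs.hdfs.impl.disable.cache=true

Finally, in etc/node.properties, configure the paths where dynamic catalog information is stored, that is, the paths on the shared file system and on the local disk that hold the catalog-related configuration files. Because every node synchronizes the configuration files from the same location on the shared file system, the shared file system path must be identical on all coordinators and workers. The local path has no such requirement.

    catalog.config-dir=/opt/openlookeng/catalog
    catalog.share.config-dir=/opt/openkeng/catalog/share

Usage

Catalog operations are performed through a RESTful API on the openLooKeng coordinator. An HTTP request has the following form (using the Hive connector as an example). The body of a POST or PUT request is multipart/form-data:

    curl --location --request POST 'http://your_coordinator_ip:9101/v1/catalog' \
    --header 'X-Presto-User: admin' \
    --form 'catalogInformation="{
        \"catalogName\" : \"hive\",
        \"connectorName\" : \"hive-hadoop2\",
        \"properties\" : {
            \"hive.hdfs.impersonation.enabled\" : \"false\",
            \"hive.hdfs.authentication.type\" : \"KERBEROS\",
            \"hive.collect-column-statistics-on-write\" : \"true\",
            \"hive.metastore.service.principal\" : \"hive/hadoop.hadoop.com@HADOOP.COM\",
            \"hive.metastore.authentication.type\" : \"KERBEROS\",
            \"hive.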
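
The catalogInformation form field in the request above is a JSON document with catalogName, connectorName, and properties keys. The following minimal Python sketch builds such a string without hand-escaping quotes. The function name build_catalog_information is hypothetical, only properties that appear in the example above are used, and the sketch does not send the request.

```python
import json


def build_catalog_information(catalog_name, connector_name, properties):
    """Serialize the catalogInformation form field as a JSON string."""
    return json.dumps(
        {
            "catalogName": catalog_name,
            "connectorName": connector_name,
            "properties": properties,
        }
    )


payload = build_catalog_information(
    "hive",
    "hive-hadoop2",
    {
        # Subset of the properties shown in the curl example.
        "hive.hdfs.impersonation.enabled": "false",
        "hive.hdfs.authentication.type": "KERBEROS",
    },
)
print(payload)
```

The resulting string would be supplied as the value of the catalogInformation form field of the multipart/form-data request.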

Version: 1.10.0