
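coco/concept-rw-elasticsearch
Amazon Elasticsearch Service (AES) is accessed through a transport that signs requests with the AWS v4 signer, with the olivere/elastic library issuing the Elasticsearch requests.
If Elasticsearch needs to be set up first, see the Elasticsearch mapping instructions.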
Download the source code and dependencies, then build the binary:

    go get github.com/Financial-Times/concept-rw-elasticsearch
    cd $GOPATH/src/github.com/Financial-Times/concept-rw-elasticsearch
    go build .

Run the unit tests:

    go test -race ./...

To run the tests through Docker Compose instead:

    docker-compose -f docker-compose-tests.yml up -d --build && \
    docker logs -f test-runner && \
    docker-compose -f docker-compose-tests.yml down -v

Run Elasticsearch locally with Docker:

    docker run -p 9200:9200 -e "http.host=0.0.0.0" -e "transport.host=127.0.0.1" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:5.3.3

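Writing data to the Elasticsearch instance creates shards. On a local standalone instance this can turn the cluster status YELLOW, because replica shards have no second node to be assigned to. To bring the status back to GREEN, send a PUT request to /_settings with this body:

    { "index" : { "number_of_replicas" : 0 } }

Elasticsearch does not log the change to GREEN, but the application health check reports healthy once the change has been made.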
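
The settings change above can also be issued from Go. The following is a minimal sketch using only the standard library. It assumes the local instance started with the Docker command above is listening on `http://localhost:9200`; the helper name `newReplicaSettingsRequest` is hypothetical and not part of this project.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// replicaSettingsBody is the request body quoted in the text above.
const replicaSettingsBody = `{"index":{"number_of_replicas":0}}`

// newReplicaSettingsRequest builds the PUT request against /_settings.
// baseURL is an assumption: a local Elasticsearch without authentication.
func newReplicaSettingsRequest(baseURL string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, baseURL+"/_settings", bytes.NewBufferString(replicaSettingsBody))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newReplicaSettingsRequest("http://localhost:9200")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request failed (is Elasticsearch running?):", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("settings update status:", resp.Status)
}
```

Sending the request to /_settings without an index name applies the setting to all existing indices, which is usually acceptable on a throwaway local instance.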
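Basic run command:

    ./concept-rw-elasticsearch --aws-access-key="{access key}" --aws-secret-access-key="{secret key}"

The Elasticsearch endpoint, the region, and the port the application runs on can also be specified. Other parameters:

| Parameter | Description | Default |
|---|---|---|
| elasticsearch-endpoint | Elasticsearch service endpoint | - |
| elasticsearch-region | Elasticsearch region (if set to local, a simple client is created that does not use the AWS signing mechanism) | - |
| port | Port the application runs on | - |
| index-name | Index name | concept |
| bulk-workers | Number of bulk workers | - |
| bulk-requests | Number of requests per bulk | - |
| bulk-size | Bulk size in bytes | - |
| flush-interval | Flush interval | - |
| whitelisted-concepts | Supported concept types (comma-separated); restricting them avoids index mapping types being defined automatically | - |
| elasticsearch-trace | Whether Elasticsearch tracing is enabled | false |
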
Currently supported concept types: "genres, topics, sections, subjects, locations, brands, organisations, people, alphaville-series, memberships".
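
As an illustration of how a comma-separated whitelisted-concepts value might be turned into a lookup set, here is a minimal sketch. The function name `parseWhitelist` is hypothetical, and the service's actual parsing may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseWhitelist splits a comma-separated list of concept types into a set,
// trimming surrounding whitespace and skipping empty entries.
func parseWhitelist(raw string) map[string]bool {
	allowed := make(map[string]bool)
	for _, item := range strings.Split(raw, ",") {
		name := strings.TrimSpace(item)
		if name != "" {
			allowed[name] = true
		}
	}
	return allowed
}

func main() {
	allowed := parseWhitelist("genres, topics, sections, subjects, locations, brands, organisations, people, alphaville-series, memberships")
	// A request for a type outside the set could then be rejected before
	// anything is written, so no new mapping type is created implicitly.
	fmt.Println(allowed["people"], allowed["unknown-type"]) // prints "true false"
}
```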
Concepts are written with PUT requests to localhost:8080/{type}/{uuid}.
Supported types: organisations, brands, genres, locations, people, sections, subjects, topics, alphaville-series, memberships.
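The membership concept is a special case. Only FT memberships are processed: the organisationUUID must be that of the FT (7bcfe07b-0fb1-49ce-a5fa-e51d5c01c3e0), and the membershipRoleUUID must be either Columnist (7ef75a6a-b6bf-4eb7-a1da-03e0acabef1b) or Journalist (33ee38a4-c677-4952-a141-2ae14da3aedd).
Memberships are not written to Elasticsearch as separate entities. Instead, they modify the associated person concept. If no record exists for that person UUID, the service creates a placeholder person object in Elasticsearch with only the id, lastModified, and isFTAuthor fields set.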
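
The membership rule can be expressed as a small predicate. The sketch below uses the UUIDs quoted above. The `personStub` type, its JSON field types, and the helper names are assumptions made for illustration, not the service's actual model.

```go
package main

import (
	"fmt"
	"time"
)

const (
	ftOrganisationUUID = "7bcfe07b-0fb1-49ce-a5fa-e51d5c01c3e0"
	columnistRoleUUID  = "7ef75a6a-b6bf-4eb7-a1da-03e0acabef1b"
	journalistRoleUUID = "33ee38a4-c677-4952-a141-2ae14da3aedd"
)

// isFTAuthorMembership reports whether a membership should be processed:
// it must belong to the FT organisation and carry one of the two author roles.
func isFTAuthorMembership(organisationUUID, membershipRoleUUID string) bool {
	if organisationUUID != ftOrganisationUUID {
		return false
	}
	return membershipRoleUUID == columnistRoleUUID || membershipRoleUUID == journalistRoleUUID
}

// personStub is a hypothetical shape for the placeholder person. Only the
// three field names come from the text; the types are assumptions.
type personStub struct {
	ID           string `json:"id"`
	LastModified string `json:"lastModified"`
	IsFTAuthor   bool   `json:"isFTAuthor"`
}

// newPersonStub builds the placeholder written when the person is unknown.
func newPersonStub(personUUID, lastModified string) personStub {
	return personStub{ID: personUUID, LastModified: lastModified, IsFTAuthor: true}
}

func main() {
	fmt.Println(isFTAuthorMembership(ftOrganisationUUID, columnistRoleUUID))          // prints "true"
	fmt.Println(isFTAuthorMembership("some-other-organisation", journalistRoleUUID)) // prints "false"
	stub := newPersonStub("08147da5-8110-407c-a51c-a91855e6b071", time.Now().UTC().Format(time.RFC3339))
	fmt.Printf("%+v\n", stub)
}
```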
A successful PUT request returns 200. A failed write returns 500 Server Error. Malformed JSON, or a UUID in the path that does not match the UUID in the request body, returns 400 Bad Request.
Example using the old concept model:

    curl -XPUT -H "Content-Type: application/json" -H "X-Request-Id: 123" localhost:8080/organisations/2384fa7a-d514-3d6a-a0ea-3a711f66d0d8 --data '{"uuid":"2384fa7a-d514-3d6a-a0ea-3a711f66d0d8","type":"PublicCompany","properName":"Apple, Inc.","prefLabel":"Apple, Inc.","legalName":"Apple Inc.","shortName":"Apple","hiddenLabel":"APPLE INC","formerNames":["Apple Computer, Inc."],"aliases":["Apple Inc","Apple Computers","Apple","Apple Canada","Apple Computer","Apple Computer, Inc.","APPLE INC","Apple Incorporated","Apple Computer Inc","Apple Inc.","Apple, Inc."],"industryClassification":"7a01c847-a9bd-33be-b991-c6fbd8871a46","alternativeIdentifiers":{"TME":["TnN0ZWluX09OX0ZvcnR1bmVDb21wYW55X0FBUEw=-T04="],"uuids":["2384fa7a-d514-3d6a-a0ea-3a711f66d0d8","2abff0bd-544d-31c3-899b-fba2f60d53dd"],"factsetIdentifier":"000C7F-E","leiCode":"HWUPKR0MPOU8FGXBT394"}}'

In this case only the following fields are saved: uuid (converted to id), prefLabel, aliases, type, and types (generated from type). All other fields are ignored.
Example using the new concept model:

    curl -XPUT -H "Content-Type: application/json" -H "X-Request-Id: 123" localhost:8080/people/08147da5-8110-407c-a51c-a91855e6b071 --data '{ "prefUUID": "08147da5-8110-407c-a51c-a91855e6b071", "prefLabel": "Anna Whitwham", "type": "Person", "aliases": [ "Anna Whitwham" ], "isAuthor": true, "sourceRepresentations": [ { "uuid": "08147da5-8110-407c-a51c-a91855e6b071", "prefLabel": "Anna Whitwham", "authority": "Smartlogic", "authorityValue": "9c2bbb54-6b1c-4b11-b005-a31ffe3b9ee7", "aliases": [ "Anna Whitwham" ], "descriptionXML": "This is replacement Anna", "type": "Person", "emailAddress": "***", "***Page": "[***]", "***Handle": "@JSmithFT", "_imageURL": "/Anna.jpg" }, { "uuid": "a725fc67-db99-30c5-b37e-9ca0b47edf95", "prefLabel": "Anna Whitwham", "type": "Person", "authority": "TME", "authorityValue": "YmUwNTk1YWUtMzdhNy00NmQ4LTg4NzYtYzZmYzgzNTAzYmYy-UE4=", "lastModifiedEpoch": ***, "aliases": [ "Anna Whitwham" ] } ] }'

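
The response codes described for PUT requests can be sketched as a single decision function. This is an illustration only: `statusForPut`, `conceptIDs`, and the `write` callback are hypothetical names. The sketch assumes the body UUID is taken from `prefUUID` (new model) when present and from `uuid` (old model) otherwise.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// errWrite simulates a failed write to Elasticsearch.
var errWrite = errors.New("write failed")

// conceptIDs holds the identifier fields seen in the two example payloads.
type conceptIDs struct {
	UUID     string `json:"uuid"`
	PrefUUID string `json:"prefUUID"`
}

// statusForPut returns the HTTP status for a PUT: 400 for malformed JSON or a
// UUID mismatch, 500 when the write fails, and 200 otherwise.
func statusForPut(pathUUID string, body []byte, write func() error) int {
	var ids conceptIDs
	if err := json.Unmarshal(body, &ids); err != nil {
		return http.StatusBadRequest
	}
	bodyUUID := ids.PrefUUID
	if bodyUUID == "" {
		bodyUUID = ids.UUID
	}
	if bodyUUID != pathUUID {
		return http.StatusBadRequest
	}
	if err := write(); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

func main() {
	ok := func() error { return nil }
	failing := func() error { return errWrite }
	path := "2384fa7a-d514-3d6a-a0ea-3a711f66d0d8"

	fmt.Println(statusForPut(path, []byte(`{"uuid":"`+path+`"}`), ok))      // prints 200
	fmt.Println(statusForPut(path, []byte(`{"uuid":"another-uuid"}`), ok))  // prints 400
	fmt.Println(statusForPut(path, []byte(`{not json`), ok))                // prints 400
	fmt.Println(statusForPut(path, []byte(`{"uuid":"`+path+`"}`), failing)) // prints 500
}
```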
Requests sent to /bulk/{type}/{uuid} are executed in batches according to the bulk processor configuration. If the application accepts the request, it always returns 200. If the write to Elasticsearch later fails, the failed request is logged (check the application logs).

    curl -XPUT -H "Content-Type: application/json" -H "X-Request-Id: 123" localhost:8080/bulk/organisations/2384fa7a-d514-3d6a-a0ea-3a711f66d0d8 --data '{"uuid":"2384fa7a-d514-3d6a-a0ea-3a711f66d0d8","type":"PublicCompany","properName":"Apple, Inc.","prefLabel":"Apple, Inc.","legalName":"Apple Inc.","shortName":"Apple","hiddenLabel":"APPLE INC","formerNames":["Apple Computer, Inc."],"aliases":["Apple Inc","Apple Computers","Apple","Apple Canada","Apple Computer","Apple Computer, Inc.","APPLE INC","Apple Incorporated","Apple Computer Inc","Apple Inc.","Apple, Inc."],"industryClassification":"7a01c847-a9bd-33be-b991-c6fbd8871a46","alternativeIdentifiers":{"TME":["TnN0ZWluX09OX0ZvcnR1bmVDb21wYW55X0FBUEw=-T04="],"uuids":["2384fa7a-d514-3d6a-a0ea-3a711f66d0d8","2abff0bd-544d-31c3-899b-fba2f60d53dd"],"factsetIdentifier":"000C7F-E","leiCode":"HWUPKR0MPOU8FGXBT394"}}'

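
To show why a bulk request can be accepted with 200 before anything reaches Elasticsearch, here is a simplified, synchronous stand-in for a bulk processor. It is not the service's code: the `batcher` type is hypothetical, and it models only thresholds in the spirit of bulk-requests and bulk-size, leaving out bulk-workers and flush-interval.

```go
package main

import "fmt"

// batcher queues documents and flushes them as one batch once either the
// request-count threshold or the byte-size threshold is reached.
type batcher struct {
	maxRequests int                  // analogous to bulk-requests
	maxBytes    int                  // analogous to bulk-size
	flush       func(batch [][]byte) // called with each completed batch
	pending     [][]byte
	pendingSize int
}

// Add queues one document. The caller gets control back immediately, which is
// why acceptance can be acknowledged before the write has happened.
func (b *batcher) Add(doc []byte) {
	b.pending = append(b.pending, doc)
	b.pendingSize += len(doc)
	if len(b.pending) >= b.maxRequests || b.pendingSize >= b.maxBytes {
		b.Flush()
	}
}

// Flush sends whatever is queued, if anything, and resets the queue.
func (b *batcher) Flush() {
	if len(b.pending) == 0 {
		return
	}
	batch := b.pending
	b.pending = nil
	b.pendingSize = 0
	b.flush(batch)
}

func main() {
	b := &batcher{
		maxRequests: 2,
		maxBytes:    1024,
		flush: func(batch [][]byte) {
			fmt.Println("flushing", len(batch), "documents")
		},
	}
	b.Add([]byte(`{"uuid":"1"}`))
	b.Add([]byte(`{"uuid":"2"}`)) // reaches maxRequests and flushes 2 documents
	b.Add([]byte(`{"uuid":"3"}`))
	b.Flush() // a timer would normally trigger this for the leftover document
}
```

A write error surfaces inside the flush callback, long after the HTTP response was sent, which is consistent with failures being reported only in the logs.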
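A GET request reads back data that has been written. If the concept is not found, 404 is returned.

    curl -H "X-Request-Id: 123" localhost:8080/organisations/2384fa7a-d514-3d6a-a0ea-3a711f66d0d8

The returned fields are Id, ApiUrl, PrefLabel, Types, DirectType, and Aliases (if present).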
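The DELETE endpoint is not exposed to clients and is intended for internal testing only. A successful delete returns 204; if the concept is not found, 404 is returned.

    curl -XDELETE -H "X-Request-Id: 123" localhost:8080/organisations/2384fa7a-d514-3d6a-a0ea-3a711f66d0d8
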
The metrics endpoint takes a request body with the concept metrics as JSON, for example {"metrics":{"annotationsCount":1234, "prevWeekAnnotationsCount": 123}}. It performs a partial update of the concept: the previous metrics are overwritten, and the rest of the document is left unchanged.

    curl -XPUT -H'X-Request-Id: tid_example' http://localhost:8080/organisations/2384fa7a-d514-3d6a-a0ea-3a711f66d0d8/metrics --data '{"metrics":{"annotationsCount":1234, "prevWeekAnnotationsCount": 123}}'

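
The effect of a metrics update on a stored document can be modelled in memory. The sketch below assumes the whole metrics object is replaced while every other key is kept. `applyMetrics` is a hypothetical helper, and the real service performs the update in Elasticsearch rather than on a Go map.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyMetrics replaces the "metrics" object of doc with the one found in
// payload and leaves all other keys of doc untouched.
func applyMetrics(doc map[string]interface{}, payload []byte) error {
	var update struct {
		Metrics map[string]interface{} `json:"metrics"`
	}
	if err := json.Unmarshal(payload, &update); err != nil {
		return err
	}
	if update.Metrics == nil {
		return fmt.Errorf("payload has no metrics object")
	}
	doc["metrics"] = update.Metrics
	return nil
}

func main() {
	doc := map[string]interface{}{
		"id":        "2384fa7a-d514-3d6a-a0ea-3a711f66d0d8",
		"prefLabel": "Apple, Inc.",
		"metrics":   map[string]interface{}{"annotationsCount": 1},
	}
	payload := []byte(`{"metrics":{"annotationsCount":1234, "prevWeekAnnotationsCount": 123}}`)
	if err := applyMetrics(doc, payload); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	out, _ := json.Marshal(doc)
	fmt.Println(string(out))
}
```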
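The health check provides the standard FT health check output, indicating the connection status and the health of the cluster.
The detailed health check provides the detailed health status of the ES cluster, matching the response of elasticsearch-endpoint/_cluster/health. It returns 503 if the service is unavailable or cannot connect to Elasticsearch.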
Returns 200 when the application is healthy and 503 Service Unavailable when it is not.

# Concept Read Writer for Elasticsearch
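This tool writes concepts to an Amazon Elasticsearch cluster in batches and also serves reads. The AWS SDK for Go does not currently support the Elasticsearch data plane API, but it has exposed its Signer since v1.2.0. Amazon Elasticsearch Service (AES) is therefore accessed as follows: a transport based on smartystreets/go-aws-auth (which uses the v4 signer) is created, and ES requests are executed through the olivere/elastic library. To set up Elasticsearch, see the mapping instructions.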
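
The idea of a signing transport can be illustrated with the standard library alone. In this sketch, `signFunc` is a hypothetical stand-in for a real v4 signer (it only sets a fake header), and the inner transport is a stub that records what it receives instead of calling AWS. Following the elasticsearch-region parameter, a region of `local` is assumed to skip signing.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// signFunc is a hypothetical stand-in for an AWS v4 signer.
type signFunc func(req *http.Request)

// signingTransport signs each request before handing it to the next transport.
type signingTransport struct {
	sign signFunc
	next http.RoundTripper
}

func (t *signingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not modify the caller's request, so sign a clone.
	signed := req.Clone(req.Context())
	t.sign(signed)
	return t.next.RoundTrip(signed)
}

// newTransport returns next unchanged for the "local" region and wraps it in
// a signing transport for any other region.
func newTransport(region string, sign signFunc, next http.RoundTripper) http.RoundTripper {
	if region == "local" {
		return next
	}
	return &signingTransport{sign: sign, next: next}
}

// roundTripFunc adapts a function to http.RoundTripper for the stub below.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// authHeaderSent pushes one request through the transport chain and reports
// the Authorization header that reached the innermost (stub) transport.
func authHeaderSent(region string) (string, error) {
	var seen string
	capture := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		seen = r.Header.Get("Authorization")
		return httptest.NewRecorder().Result(), nil
	})
	fakeSign := func(r *http.Request) { r.Header.Set("Authorization", "FAKE-SIGNATURE") }
	req, err := http.NewRequest(http.MethodGet, "http://es.example.invalid/_cluster/health", nil)
	if err != nil {
		return "", err
	}
	resp, err := newTransport(region, fakeSign, capture).RoundTrip(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return seen, nil
}

func main() {
	signed, _ := authHeaderSent("eu-west-1")
	local, _ := authHeaderSent("local")
	fmt.Printf("signed=%q local=%q\n", signed, local) // prints signed="FAKE-SIGNATURE" local=""
}
```

An `http.Client` built around such a transport can be passed to any Elasticsearch client library that accepts a custom HTTP client, which keeps signing separate from query logic.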

