-
Category: Java
-
Project title: The industry's first distributed streaming graph computing engine
-
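Project description: GeaFlow (brand name TuGraph-Analytics) is a distributed real-time graph computing engine open-sourced by Ant Group. It is widely used in financial risk control, social networks, knowledge graphs, and data applications. GeaFlow's core capability is streaming graph computing, which offers a fresher, lower-latency mode of graph computation than offline graph computing. Traditional stream processing engines such as Flink and Storm are real-time systems built around table data. GeaFlow instead targets real-time processing of graph data and supports more complex relationship analysis, such as real-time multi-hop relationship lookup and cycle detection. It also supports unified real-time analysis over graphs and tables, so a single job can process both table data and graph data. For more on GeaFlow's use cases, see the GeaFlow introduction document.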
-
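Highlights: Graph computing is increasingly widely used in industry, yet the open-source community still lacks an easy-to-use graph computing engine, and streaming graph computing in particular remains essentially uncovered. Starting from business scenarios such as financial risk control, Ant's graph computing team built GeaFlow, a real-time graph computing engine that handles graphs with trillions of vertices and edges, and has officially open-sourced it.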
-
Sample code:

```sql
set geaflow.dsl.window.size = 1;

CREATE GRAPH IF NOT EXISTS dy_modern (
  Vertex person (id bigint ID, name varchar),
  Edge knows (srcId bigint SOURCE ID, targetId bigint DESTINATION ID, weight double)
) WITH (storeType='rocksdb', shardCount = 1);

CREATE TABLE IF NOT EXISTS tbl_source (
  text varchar
) WITH (
  type='socket',
  `geaflow.dsl.column.separator` = '#',
  `geaflow.dsl.socket.host` = '${your.host.ip}',
  `geaflow.dsl.socket.port` = 9003
);

CREATE TABLE IF NOT EXISTS tbl_result (
  a_id bigint,
  b_id bigint,
  c_id bigint,
  d_id bigint,
  a1_id bigint
) WITH (
  type='socket',
  `geaflow.dsl.column.separator` = ',',
  `geaflow.dsl.socket.host` = '${your.host.ip}',
  `geaflow.dsl.socket.port` = 9003
);

USE GRAPH dy_modern;

-- Lines starting with '.' are vertices: "<id>,<name>"
INSERT INTO dy_modern.person(id, name)
SELECT cast(split_ex(t1, ',', 0) as bigint), split_ex(t1, ',', 1)
FROM (
  SELECT trim(substr(text, 2)) as t1
  FROM tbl_source
  WHERE substr(text, 1, 1) = '.'
);

-- Lines starting with '-' are edges: "<srcId>,<targetId>,<weight>"
INSERT INTO dy_modern.knows
SELECT cast(split_ex(t1, ',', 0) as bigint), cast(split_ex(t1, ',', 1) as bigint), cast(split_ex(t1, ',', 2) as double)
FROM (
  SELECT trim(substr(text, 2)) as t1
  FROM tbl_source
  WHERE substr(text, 1, 1) = '-'
);

-- Find 4-hop loops: a -> b -> c -> d -> a
INSERT INTO tbl_result
SELECT a_id, b_id, c_id, d_id, a1_id
FROM (
  MATCH (a:person) -[:knows]->(b:person) -[:knows]-> (c:person) -[:knows]-> (d:person) -> (a:person)
  RETURN a.id as a_id, b.id as b_id, c.id as c_id, d.id as d_id, a.id as a1_id
);
```
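To make the intent of the script easier to follow, here is a minimal in-memory Java sketch of what it appears to compute. It assumes the input convention implied by the two `INSERT` statements (a line starting with `.` carries a vertex as `id,name`, a line starting with `-` carries an edge as `srcId,targetId,weight`), and it assumes the `MATCH` pattern does not require `a`, `b`, `c`, and `d` to be distinct. The class `FourHopLoopSketch` and its methods are hypothetical names for illustration only. They are not part of GeaFlow's API, and the sketch recomputes the result from scratch rather than incrementally as a streaming engine would.

```java
import java.util.*;

// Illustrative only: a plain-Java stand-in for the loop query, not GeaFlow code.
class FourHopLoopSketch {
    // Vertex ids of label "person"; TreeSet keeps iteration order deterministic.
    private final Set<Long> persons = new TreeSet<>();
    // "knows" edges as an adjacency list: source id -> target ids.
    private final Map<Long, List<Long>> knows = new HashMap<>();

    // Applies one input line: '.' prefix = vertex, '-' prefix = edge.
    void accept(String text) {
        if (text.isEmpty()) {
            return;
        }
        String[] fields = text.substring(1).trim().split(",");
        if (text.charAt(0) == '.') {
            persons.add(Long.parseLong(fields[0].trim()));
        } else if (text.charAt(0) == '-') {
            long src = Long.parseLong(fields[0].trim());
            long dst = Long.parseLong(fields[1].trim());
            knows.computeIfAbsent(src, k -> new ArrayList<>()).add(dst);
        }
    }

    // Out-neighbours of v that are known persons.
    private List<Long> next(long v) {
        List<Long> out = new ArrayList<>();
        for (long t : knows.getOrDefault(v, Collections.<Long>emptyList())) {
            if (persons.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Enumerates every walk a -> b -> c -> d -> a as a row {a, b, c, d, a}.
    List<long[]> findLoops() {
        List<long[]> rows = new ArrayList<>();
        for (long a : persons) {
            for (long b : next(a)) {
                for (long c : next(b)) {
                    for (long d : next(c)) {
                        if (next(d).contains(a)) {
                            rows.add(new long[] {a, b, c, d, a});
                        }
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        FourHopLoopSketch g = new FourHopLoopSketch();
        for (String line : new String[] {
                ". 1,jim", ". 2,kate", ". 3,lily", ". 4,lucy",
                "- 1,2,0.2", "- 2,3,0.3", "- 3,4,0.2"}) {
            g.accept(line);
        }
        System.out.println(g.findLoops().size()); // no loop yet: prints 0
        g.accept("- 4,1,0.1");                    // this edge closes the loop
        for (long[] row : g.findLoops()) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each closed loop is reported once per starting vertex, so the single cycle 1 → 2 → 3 → 4 → 1 yields four rows in this sketch.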
-
Screenshots: (optional) gif/png/jpg
-
Update plan: a major release every two months, with fixed weekly iterations.