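I first came across Apache Arrow Gandiva by accident while browsing the Arrow project. Drawn in by the words LLVM and JIT on the project page, I even installed and ran it on Ubuntu, but in the end I couldn't work out in what scenario I would actually use it, so I gave up on it 😂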

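Then a few days ago, after reading the NoisePage paper and part of its source code, I kept feeling that I had seen this combination of Arrow and LLVM somewhere before: it was Apache Arrow Gandiva. So this time I decided to read its source code as well and figure out what this thing really is.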

Related resources

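If you search Bing for "Apache Arrow Gandiva" today, the second result is a Zhihu post titled "Apache Arrow Gandiva: Lofty Ideals and an Awkward Reality" (Apache Arrow Gandiva:远大理想与尴尬现实), which was the main reason I gave up back then. But what I want to ask today is: why process Arrow data from Java at all? 😂 If I'm working in Rust or C++, Gandiva isn't awkward in the slightest. On the contrary, it's quite interesting: Apache Arrow Gandiva has done a lot of work to bridge the LLVM and Arrow ecosystems, which leaves researchers plenty of room to explore.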

Chinese-language documentation

Gandiva Expressions, Projectors, and Filters (Gandiva表达式、投影器和过滤器)

Gandiva External Function Development Guide (Gandiva 外部函数开发指南)

Material from Dremio

Introducing the Gandiva Initiative for Apache Arrow

Adding a User Defined Function to Gandiva

Gandiva Initiative: Improving SQL Performance by 70x

Community blog posts

Lakehouse: All About Apache Arrow (湖仓一体 - Apache Arrow的那些事)

Project history and current status

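Dremio donated the project to Apache Arrow in 2018, and it is now one of Arrow's subprojects (source: Gandiva: A LLVM-based Analytical Expression Compiler for Apache Arrow). Dig a little further and you'll find that quite a few Arrow contributors work at Dremio, that Dremio itself uses Apache Arrow, and that Gandiva is described as part of Dremio's execution engine.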

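Gandiva's biggest highlight is that it relies on LLVM's auto-vectorization to process Arrow data in a vectorized fashion, and on the LLVM side it also implements Project and Filter. Add Join and Aggregation on top and most SQL operations would be covered; throw NoisePage into the mix and you could even put together a complete, purely LLVM-based CRUD pipeline over Arrow.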

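Rumor has it that the project has been abandoned (which simply isn't true 😅). In fact Gandiva has kept receiving maintenance commits, and it caught up quickly after LLVM 20 came out this year.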


Gandiva currently ships C and C++ libraries, but there seems to be no support for the Rust implementation of Arrow: Interfaces for gandiva bindings.

Source code walkthrough

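The code was downloaded on 2025-06-24. Nearly everything sits flat in a single directory; only precompiled/ and tests/ are subdirectories.

Location of the Gandiva source: https://github.com/apache/arrow/tree/main/cpp/src/gandiva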

|-- CMakeLists.txt

|-- GandivaConfig.cmake.in

|-- annotator.cc

|-- annotator.h

|-- annotator_test.cc

|-- arrow.h

|-- basic_decimal_scalar.h

|-- bitmap_accumulator.cc

|-- bitmap_accumulator.h

|-- bitmap_accumulator_test.cc

|-- cache.cc

|-- cache.h

|-- cache_test.cc

|-- cast_time.cc

|-- compiled_expr.h

|-- condition.h

|-- configuration.cc

|-- configuration.h

|-- context_helper.cc

|-- date_utils.cc

|-- date_utils.h

|-- decimal_ir.cc

|-- decimal_ir.h

|-- decimal_scalar.h

|-- decimal_type_util.cc

|-- decimal_type_util.h

|-- decimal_type_util_test.cc

|-- decimal_xlarge.cc

|-- decimal_xlarge.h

|-- dex.h

|-- dex_visitor.h

|-- encrypt_utils.cc

|-- encrypt_utils.h

|-- encrypt_utils_test.cc

|-- engine.cc

|-- engine.h

|-- engine_llvm_test.cc

|-- eval_batch.h

|-- execution_context.h

|-- exported_funcs.cc

|-- exported_funcs.h

|-- exported_funcs_registry.cc

|-- exported_funcs_registry.h

|-- exported_funcs_registry_test.cc

|-- expr_decomposer.cc

|-- expr_decomposer.h

|-- expr_decomposer_test.cc

|-- expr_validator.cc

|-- expr_validator.h

|-- expression.cc

|-- expression.h

|-- expression_cache_key.h

|-- expression_registry.cc

|-- expression_registry.h

|-- expression_registry_test.cc

|-- external_c_functions.cc

|-- field_descriptor.h

|-- filter.cc

|-- filter.h

|-- formatting_utils.h

|-- func_descriptor.h

|-- function_holder.h

|-- function_holder_maker_registry.cc

|-- function_holder_maker_registry.h

|-- function_ir_builder.cc

|-- function_ir_builder.h

|-- function_registry.cc

|-- function_registry.h

|-- function_registry_arithmetic.cc

|-- function_registry_arithmetic.h

|-- function_registry_common.h

|-- function_registry_datetime.cc

|-- function_registry_datetime.h

|-- function_registry_hash.cc

|-- function_registry_hash.h

|-- function_registry_math_ops.cc

|-- function_registry_math_ops.h

|-- function_registry_string.cc

|-- function_registry_string.h

|-- function_registry_test.cc

|-- function_registry_timestamp_arithmetic.cc

|-- function_registry_timestamp_arithmetic.h

|-- function_signature.cc

|-- function_signature.h

|-- function_signature_test.cc

|-- gandiva.pc.in

|-- gandiva_aliases.h

|-- gandiva_object_cache.cc

|-- gandiva_object_cache.h

|-- gdv_function_stubs.cc

|-- gdv_function_stubs.h

|-- gdv_function_stubs_test.cc

|-- gdv_hash_function_stubs.cc

|-- gdv_string_function_stubs.cc

|-- hash_utils.cc

|-- hash_utils.h

|-- hash_utils_test.cc

|-- in_holder.h

|-- interval_holder.cc

|-- interval_holder.h

|-- interval_holder_test.cc

|-- literal_holder.cc

|-- literal_holder.h

|-- llvm_generator.cc

|-- llvm_generator.h

|-- llvm_generator_test.cc

|-- llvm_includes.h

|-- llvm_types.cc

|-- llvm_types.h

|-- llvm_types_test.cc

|-- local_bitmaps_holder.h

|-- lru_cache.h

|-- lru_cache_test.cc

|-- lvalue.h

|-- make_precompiled_bitcode.py

|-- native_function.h

|-- node.h

|-- node_visitor.h

|-- precompiled

| |-- CMakeLists.txt

| |-- arithmetic_ops.cc

| |-- arithmetic_ops_test.cc

| |-- bitmap.cc

| |-- bitmap_test.cc

| |-- decimal_ops.cc

| |-- decimal_ops.h

| |-- decimal_ops_test.cc

| |-- decimal_wrapper.cc

| |-- epoch_time_point.h

| |-- epoch_time_point_test.cc

| |-- extended_math_ops.cc

| |-- extended_math_ops_test.cc

| |-- hash.cc

| |-- hash_test.cc

| |-- print.cc

| |-- string_ops.cc

| |-- string_ops_test.cc

| |-- testing.h

| |-- time.cc

| |-- time_constants.h

| |-- time_fields.h

| |-- time_test.cc

| |-- timestamp_arithmetic.cc

| `-- types.h

|-- precompiled_bitcode.cc.in

|-- projector.cc

|-- projector.h

|-- random_generator_holder.cc

|-- random_generator_holder.h

|-- random_generator_holder_test.cc

|-- regex_functions_holder.cc

|-- regex_functions_holder.h

|-- regex_functions_holder_test.cc

|-- regex_util.cc

|-- regex_util.h

|-- selection_vector.cc

|-- selection_vector.h

|-- selection_vector_impl.h

|-- selection_vector_test.cc

|-- simple_arena.h

|-- simple_arena_test.cc

|-- symbols.map

|-- tests

| |-- CMakeLists.txt

| |-- binary_test.cc

| |-- boolean_expr_test.cc

| |-- date_time_test.cc

| |-- decimal_single_test.cc

| |-- decimal_test.cc

| |-- external_functions

| | |-- CMakeLists.txt

| | |-- multiply_by_two.cc

| | `-- multiply_by_two.h

| |-- filter_project_test.cc

| |-- filter_test.cc

| |-- generate_data.h

| |-- hash_test.cc

| |-- huge_table_test.cc

| |-- if_expr_test.cc

| |-- in_expr_test.cc

| |-- literal_test.cc

| |-- micro_benchmarks.cc

| |-- null_validity_test.cc

| |-- projector_build_validation_test.cc

| |-- projector_test.cc

| |-- test_util.cc

| |-- test_util.h

| |-- timed_evaluate.h

| |-- to_string_test.cc

| `-- utf8_test.cc

|-- to_date_holder.cc

|-- to_date_holder.h

|-- to_date_holder_test.cc

|-- tree_expr_builder.cc

|-- tree_expr_builder.h

|-- tree_expr_test.cc

|-- value_validity_pair.h

`-- visibility.h

The code base is very large, so only selected parts are analyzed here.

node

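Definitions of the expression-tree node types. The excerpt below is the visitor interface (NodeVisitor), which enumerates every node type.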

namespace gandiva {

class FieldNode;

class FunctionNode;

class IfNode;

class LiteralNode;

class BooleanNode;

template <typename Type>

class InExpressionNode;

/// \brief Visitor for nodes in the expression tree.

class GANDIVA_EXPORT NodeVisitor {

public:

virtual ~NodeVisitor() = default;

virtual Status Visit(const FieldNode& node) = 0;

virtual Status Visit(const FunctionNode& node) = 0;

virtual Status Visit(const IfNode& node) = 0;

virtual Status Visit(const LiteralNode& node) = 0;

virtual Status Visit(const BooleanNode& node) = 0;

virtual Status Visit(const InExpressionNode<int32_t>& node) = 0;

virtual Status Visit(const InExpressionNode<int64_t>& node) = 0;

virtual Status Visit(const InExpressionNode<float>& node) = 0;

virtual Status Visit(const InExpressionNode<double>& node) = 0;

virtual Status Visit(const InExpressionNode<gandiva::DecimalScalar128>& node) = 0;

virtual Status Visit(const InExpressionNode<std::string>& node) = 0;

};

} // namespace gandiva
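
To make the visitor interface above a bit more concrete, here is a minimal standalone sketch of the same idea: node classes that accept a visitor, and a visitor that evaluates the tree. Every name in it (`LiteralNode`, `BinaryNode`, `EvalVisitor`, `Lit`, `Bin`, `Evaluate`) is made up for illustration and is not Gandiva's API; Gandiva's real visitors return `Status` and produce Dex objects rather than numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy node types; hypothetical, only for illustrating the visitor pattern.
struct LiteralNode;
struct BinaryNode;

// One Visit overload per node type, selected by overload resolution.
struct Visitor {
  virtual ~Visitor() = default;
  virtual void Visit(const LiteralNode& node) = 0;
  virtual void Visit(const BinaryNode& node) = 0;
};

struct Node {
  virtual ~Node() = default;
  virtual void Accept(Visitor& visitor) const = 0;
};
using NodePtr = std::shared_ptr<Node>;

struct LiteralNode : Node {
  explicit LiteralNode(int64_t v) : value(v) {}
  void Accept(Visitor& visitor) const override { visitor.Visit(*this); }
  int64_t value;
};

struct BinaryNode : Node {
  BinaryNode(char o, NodePtr l, NodePtr r)
      : op(o), left(std::move(l)), right(std::move(r)) {}
  void Accept(Visitor& visitor) const override { visitor.Visit(*this); }
  char op;  // '+' or '*'
  NodePtr left;
  NodePtr right;
};

// A visitor that evaluates the tree bottom-up.
struct EvalVisitor : Visitor {
  void Visit(const LiteralNode& node) override { result = node.value; }
  void Visit(const BinaryNode& node) override {
    node.left->Accept(*this);
    const int64_t lhs = result;
    node.right->Accept(*this);
    const int64_t rhs = result;
    result = (node.op == '*') ? lhs * rhs : lhs + rhs;
  }
  int64_t result = 0;
};

// Small factory helpers, loosely in the spirit of a tree builder.
NodePtr Lit(int64_t v) { return std::make_shared<LiteralNode>(v); }
NodePtr Bin(char op, NodePtr l, NodePtr r) {
  return std::make_shared<BinaryNode>(op, std::move(l), std::move(r));
}

int64_t Evaluate(const NodePtr& root) {
  EvalVisitor visitor;
  root->Accept(visitor);
  return visitor.result;
}
```

With these helpers, `Evaluate(Bin('+', Bin('*', Lit(4), Lit(5)), Lit(3)))` walks the tree for 4*5+3. Adding a new operation over the tree means writing another visitor, without touching the node classes.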

tree_expr

tree_expr_test.cc

tree_expr_builder.cc

tree_expr_builder.h

Used to build expression trees for computations such as 4*5+3; the tree itself is constructed through TreeExprBuilder.

TEST_F(TestExprTree, TestField) {

Annotator annotator;

auto n0 = TreeExprBuilder::MakeField(i0_);

EXPECT_EQ(n0->return_type(), int32());

auto n1 = TreeExprBuilder::MakeField(b0_);

EXPECT_EQ(n1->return_type(), boolean());

ExprDecomposer decomposer(*registry_, annotator);

ValueValidityPairPtr pair;

auto status = decomposer.Decompose(*n1, &pair);

DCHECK_EQ(status.ok(), true) << status.message();

auto value = pair->value_expr();

auto value_dex = std::dynamic_pointer_cast<VectorReadFixedLenValueDex>(value);

EXPECT_EQ(value_dex->FieldType(), boolean());

EXPECT_EQ(pair->validity_exprs().size(), 1);

auto validity = pair->validity_exprs().at(0);

auto validity_dex = std::dynamic_pointer_cast<VectorReadValidityDex>(validity);

EXPECT_NE(validity_dex->ValidityIdx(), value_dex->DataIdx());

}

Nodes are created through overloaded factory functions, and the tree is traversed and transformed with the visitor pattern.

class GANDIVA_EXPORT TreeExprBuilder {

public:

/// \brief create a node on a literal.

static NodePtr MakeLiteral(bool value);

static NodePtr MakeLiteral(uint8_t value);

static NodePtr MakeLiteral(uint16_t value);

static NodePtr MakeLiteral(uint32_t value);

static NodePtr MakeLiteral(uint64_t value);

static NodePtr MakeLiteral(int8_t value);

static NodePtr MakeLiteral(int16_t value);

static NodePtr MakeLiteral(int32_t value);

static NodePtr MakeLiteral(int64_t value);

static NodePtr MakeLiteral(float value);

static NodePtr MakeLiteral(double value);

static NodePtr MakeStringLiteral(const std::string& value);

static NodePtr MakeBinaryLiteral(const std::string& value);

static NodePtr MakeDecimalLiteral(const DecimalScalar128& value);

to_date_holder

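Converts strings to dates. Judging from the expected values in the test, the result is milliseconds since the Unix epoch, with the time of day dropped.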

TEST_F(TestToDateHolder, TestSimpleDateTime) {

EXPECT_OK_AND_ASSIGN(auto to_date_holder, ToDateHolder::Make("YYYY-MM-DD HH:MI:SS", 1));

auto& to_date = *to_date_holder;

bool out_valid;

std::string s("1986-12-01 01:01:01");

int64_t millis_since_epoch =

to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);

EXPECT_EQ(millis_since_epoch, 533779200000);

s = std::string("1986-12-01 01:01:01.11");

millis_since_epoch =

to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);

EXPECT_EQ(millis_since_epoch, 533779200000);

s = std::string("1986-12-01 01:01:01 +0800");

millis_since_epoch =

to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);

EXPECT_EQ(millis_since_epoch, 533779200000);

#if 0

// TODO : this fails parsing with date::parse and strptime on linux

s = std::string("1886-12-01 00:00:00");

millis_since_epoch =

to_date(&execution_context_, s.data(), (int) s.length(), true, &out_valid);

EXPECT_EQ(out_valid, true);

EXPECT_EQ(millis_since_epoch, -2621894400000);

#endif

s = std::string("1886-12-01 01:01:01");

millis_since_epoch =

to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);

EXPECT_EQ(millis_since_epoch, -2621894400000);

s = std::string("1986-12-11 01:30:00");

millis_since_epoch =

to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);

EXPECT_EQ(millis_since_epoch, 534643200000);

}
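
The expected values in this test can be checked by hand: each one is the number of days from 1970-01-01 to the given date, times 86,400,000 ms, so the time-of-day part of the input never shows up in the result. Below is a small standalone sketch of that arithmetic. It uses the well-known "days from civil" calculation, not Gandiva's actual parsing code, and `DaysFromCivil` and `DateToMillis` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Days from 1970-01-01 to a proleptic Gregorian date
// (the widely used "days from civil" arithmetic).
int64_t DaysFromCivil(int64_t y, unsigned m, unsigned d) {
  y -= m <= 2;  // treat January and February as months of the previous year
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const unsigned yoe = static_cast<unsigned>(y - era * 400);             // [0, 399]
  const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
  const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
  return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

// Hypothetical helper: reads the leading "YYYY-MM-DD" of a string and returns
// milliseconds since the epoch at midnight; anything after the date is ignored.
bool DateToMillis(const std::string& s, int64_t* out) {
  int year = 0;
  unsigned month = 0;
  unsigned day = 0;
  if (std::sscanf(s.c_str(), "%d-%u-%u", &year, &month, &day) != 3) return false;
  if (month < 1 || month > 12 || day < 1 || day > 31) return false;
  *out = DaysFromCivil(year, month, day) * 86400000LL;
  return true;
}
```

For "1986-12-01 01:01:01" the day count is 6178, and 6178 × 86,400,000 = 533779200000, which matches the value in the test.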

simple_arena

I didn't fully understand this part. It appears to deal with memory allocation, handing out memory in units of chunks.

TEST_F(TestSimpleArena, TestAlloc) {

int64_t chunk_size = 4096;

SimpleArena arena(arrow::default_memory_pool(), chunk_size);

// Small allocations should come from the same chunk.

int64_t small_size = 100;

for (int64_t i = 0; i < 20; ++i) {

auto p = arena.Allocate(small_size);

EXPECT_NE(p, nullptr);

EXPECT_EQ(arena.total_bytes(), chunk_size);

EXPECT_EQ(arena.avail_bytes(), chunk_size - (i + 1) * small_size);

}

// large allocations require separate chunks

int64_t large_size = 100 * chunk_size;

auto p = arena.Allocate(large_size);

EXPECT_NE(p, nullptr);

EXPECT_EQ(arena.total_bytes(), chunk_size + large_size);

EXPECT_EQ(arena.avail_bytes(), 0);

}
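
Reading the test backwards, the behavior seems to be: small requests are carved out of the current chunk, and a request that doesn't fit opens a new chunk that is at least as large as the request. Here is a minimal sketch of an allocator with that behavior. `ToyArena` is a hypothetical class written to satisfy the assertions in the test above; it is not the real SimpleArena, which allocates from an arrow::MemoryPool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy chunk-based arena; memory is released only when the arena is destroyed.
class ToyArena {
 public:
  explicit ToyArena(int64_t min_chunk_size) : min_chunk_size_(min_chunk_size) {}

  uint8_t* Allocate(int64_t size) {
    if (size > avail_bytes_) {
      // Not enough room left: open a new chunk, at least min_chunk_size_ big.
      // Whatever was left in the previous chunk is simply abandoned.
      const int64_t chunk_size = size > min_chunk_size_ ? size : min_chunk_size_;
      chunks_.push_back(
          std::make_unique<uint8_t[]>(static_cast<std::size_t>(chunk_size)));
      total_bytes_ += chunk_size;
      avail_bytes_ = chunk_size;
      cursor_ = chunks_.back().get();
    }
    // Bump allocation: hand out the current position and advance the cursor.
    uint8_t* result = cursor_;
    cursor_ += size;
    avail_bytes_ -= size;
    return result;
  }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t avail_bytes() const { return avail_bytes_; }

 private:
  int64_t min_chunk_size_;
  int64_t total_bytes_ = 0;
  int64_t avail_bytes_ = 0;
  uint8_t* cursor_ = nullptr;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
};
```

The appeal of this design is that an allocation is just a pointer bump, and everything allocated while evaluating a batch can be thrown away at once.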

selection_vector

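Implements the selection vector on top of Arrow-format storage.

Some background on selection vectors is needed here.

A selection vector is a technique used in data processing systems to record which rows in a batch are selected (valid), so that irrelevant rows are never touched. It is common in columnar databases and vectorized execution engines (such as Apache Arrow, Dremio, and Gandiva), where it is used to improve performance.

A selection vector is essentially an index array holding the positions of the selected rows within the original batch. Its benefits:

Avoids copying data: only the vector is manipulated, and the original data stays where it is.

Efficient filtering: rows that fail the predicate can be skipped quickly.

Support for vectorized execution: combined with batch processing, it helps SIMD performance.

In a concrete implementation, the selection may be represented as a bitmap or as a set of indices.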

TEST_F(TestSelectionVector, TestInt16Set) {

int max_slots = 10;

std::shared_ptr<SelectionVector> selection;

auto status = SelectionVector::MakeInt16(max_slots, pool_, &selection);

EXPECT_EQ(status.ok(), true) << status.message();

selection->SetIndex(0, 100);

EXPECT_EQ(selection->GetIndex(0), 100);

selection->SetIndex(1, 200);

EXPECT_EQ(selection->GetIndex(1), 200);

selection->SetNumSlots(2);

EXPECT_EQ(selection->GetNumSlots(), 2);

// TopArray() should return an array with 100,200

auto array_raw = selection->ToArray();

const auto& array = dynamic_cast<const arrow::UInt16Array&>(*array_raw);

EXPECT_EQ(array.length(), 2) << array_raw->ToString();

EXPECT_EQ(array.Value(0), 100) << array_raw->ToString();

EXPECT_EQ(array.Value(1), 200) << array_raw->ToString();

}

A selection vector can also be populated from a bitmap

TEST_F(TestSelectionVector, TestInt64PopulateFromBitMap) {

int max_slots = 200;

std::shared_ptr<SelectionVector> selection;

auto status = SelectionVector::MakeInt64(max_slots, pool_, &selection);

EXPECT_EQ(status.ok(), true) << status.message();

int bitmap_size = RoundUpNumi64(max_slots) * 8;

std::vector<uint8_t> bitmap(bitmap_size);

arrow::bit_util::SetBit(&bitmap[0], 0);

arrow::bit_util::SetBit(&bitmap[0], 5);

arrow::bit_util::SetBit(&bitmap[0], 121);

arrow::bit_util::SetBit(&bitmap[0], 220);

status = selection->PopulateFromBitMap(&bitmap[0], bitmap_size, max_slots - 1);

EXPECT_EQ(status.ok(), true) << status.message();

EXPECT_EQ(selection->GetNumSlots(), 3);

EXPECT_EQ(selection->GetIndex(0), 0);

EXPECT_EQ(selection->GetIndex(1), 5);

EXPECT_EQ(selection->GetIndex(2), 121);

}
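
The bitmap-to-index conversion in the test above boils down to collecting the positions of the set bits. Below is a minimal sketch of that step, assuming Arrow's least-significant-bit-first bit order inside each byte. `SetBit` and `IndicesFromBitmap` are hypothetical helpers rather than Gandiva functions, and this version silently ignores bits beyond `max_index`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Set bit i in a bitmap (bit 0 is the least significant bit of byte 0).
void SetBit(std::vector<uint8_t>& bitmap, int64_t i) {
  bitmap[static_cast<std::size_t>(i / 8)] |= static_cast<uint8_t>(1u << (i % 8));
}

// Collect the positions of all set bits up to and including max_index.
// The returned index array is the "selection vector".
std::vector<int64_t> IndicesFromBitmap(const std::vector<uint8_t>& bitmap,
                                       int64_t max_index) {
  std::vector<int64_t> indices;
  const int64_t num_bits = static_cast<int64_t>(bitmap.size()) * 8;
  for (int64_t i = 0; i < num_bits && i <= max_index; ++i) {
    const uint8_t byte = bitmap[static_cast<std::size_t>(i / 8)];
    if ((byte >> (i % 8)) & 1u) indices.push_back(i);
  }
  return indices;
}
```

With bits 0, 5, 121 and 220 set and `max_index` 199, only 0, 5 and 121 are collected, the same three slots the test expects. A downstream operator can then loop over just those indices instead of copying the surviving rows.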

regex_functions/util

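Regular-expression support, which also appears to handle SQL-related pattern symbols. This part uses Google's RE2 library and follows PCRE (Perl Compatible Regular Expressions) conventions.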

const std::set<char> RegexUtil::pcre_regex_specials_ = {

'[', ']', '(', ')', '|', '^', '-', '+', '*', '?', '{', '}', '$', '\\', '.'};
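
My guess is that a set of special characters like this exists so that a SQL LIKE pattern can be turned into a regular expression: the wildcards get translated and every other regex metacharacter gets escaped. The sketch below shows that idea under this assumption. `LikeToRegex` is a hypothetical helper, not Gandiva's function, and it ignores LIKE escape characters altogether.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical sketch: translate a SQL LIKE pattern into a regex string.
//   %  matches any sequence of characters -> ".*"
//   _  matches exactly one character      -> "."
// Every other regex metacharacter is escaped with a backslash.
std::string LikeToRegex(const std::string& like_pattern) {
  static const std::set<char> kSpecials = {'[', ']', '(', ')', '|', '^', '-', '+',
                                           '*', '?', '{', '}', '$', '\\', '.'};
  std::string regex;
  for (char c : like_pattern) {
    if (c == '%') {
      regex += ".*";
    } else if (c == '_') {
      regex += '.';
    } else {
      if (kSpecials.count(c) != 0) regex += '\\';
      regex += c;
    }
  }
  return regex;
}
```

For example, the LIKE pattern `a.b%c_` becomes the regex `a\.b.*c.`: the literal dot is escaped, while `%` and `_` turn into their regex counterparts.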

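The tests mostly revolve around simple strings.

You can even find a test involving Chinese characters, which is quite a rarity. UTF-8 handling in C++ is something I've never managed to get my head around 😂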

input_string = "路%c$大";

extract_index = 2; // Retrieve all matched string

ret = extract_numbers(&execution_context_, input_string.c_str(),

static_cast<int32_t>(input_string.length()), extract_index,

&out_length);

ret_as_str = std::string(ret, out_length);

EXPECT_EQ(out_length, 1);

EXPECT_EQ(ret_as_str, "c");

random_generator

The random number generator, which carries the random seed information.

namespace gandiva {

/// Function Holder for 'random'

class GANDIVA_EXPORT RandomGeneratorHolder : public FunctionHolder {

public:

~RandomGeneratorHolder() override = default;

static Result<std::shared_ptr<RandomGeneratorHolder>> Make(const FunctionNode& node);

double operator()() { return distribution_(generator_); }

private:

explicit RandomGeneratorHolder(int seed) : distribution_(0, 1) {

int64_t seed64 = static_cast<int64_t>(seed);

seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;

generator_.seed(static_cast<uint64_t>(seed64));

}

RandomGeneratorHolder() : distribution_(0, 1) {

generator_.seed(::arrow::internal::GetRandomSeed());

}

std::mt19937_64 generator_;

std::uniform_real_distribution<> distribution_;

};

} // namespace gandiva
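
The seed handling is the interesting bit here: the user-supplied seed is XOR-ed with a constant and masked to 48 bits before it reaches the generator, which as far as I can tell mirrors the way java.util.Random scrambles its seed. Below is a standalone sketch of the seeded path only. `ToyRandom` is a hypothetical stand-in that copies the arithmetic from the excerpt above and leaves out the FunctionHolder plumbing.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Toy version of the seeded random holder: same scrambling, same distribution.
class ToyRandom {
 public:
  explicit ToyRandom(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    // XOR with a fixed constant, then keep only the low 48 bits.
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }

  // Returns the next pseudo-random double from the uniform distribution on [0, 1).
  double operator()() { return distribution_(generator_); }

 private:
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
```

Two instances built with the same seed produce the same sequence, which is what makes a seeded `random(seed)` call reproducible.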

project

This is the code where Gandiva implements projection (Project) over Apache Arrow data.

/// \brief projection using expressions.

///

/// A projector is built for a specific schema and vector of expressions.

/// Once the projector is built, it can be used to evaluate many row batches.

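The implementation holds the LLVM generator, the schema, the output fields, the code-generation configuration, and a flag recording whether the projector was built from the cache.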

std::unique_ptr<LLVMGenerator> llvm_generator_;

SchemaPtr schema_;

FieldVector output_fields_;

std::shared_ptr<Configuration> configuration_;

bool built_from_cache_;

};

It also includes the code that allocates the output data buffers

Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records,

arrow::MemoryPool* pool,

ArrayDataPtr* array_data) const {

arrow::Status astatus;

std::vector<std::shared_ptr<arrow::Buffer>> buffers;

// The output vector always has a null bitmap.

int64_t size = arrow::bit_util::BytesForBits(num_records);

ARROW_ASSIGN_OR_RAISE(auto bitmap_buffer, arrow::AllocateBuffer(size, pool));

buffers.push_back(std::move(bitmap_buffer));

// String/Binary vectors have an offsets array.

auto type_id = type->id();

if (arrow::is_binary_like(type_id)) {

auto offsets_len = arrow::bit_util::BytesForBits((num_records + 1) * 32);

ARROW_ASSIGN_OR_RAISE(auto offsets_buffer, arrow::AllocateBuffer(offsets_len, pool));

buffers.push_back(std::move(offsets_buffer));

}

// The output vector always has a data array.

int64_t data_len;

if (arrow::is_primitive(type_id) || type_id == arrow::Type::DECIMAL) {

const auto& fw_type = static_cast<const arrow::FixedWidthType&>(*type);

data_len = arrow::bit_util::BytesForBits(num_records * fw_type.bit_width());

} else if (arrow::is_binary_like(type_id)) {

// we don't know the expected size for varlen output vectors.

data_len = 0;

} else {

return Status::Invalid("Unsupported output data type " + type->ToString());

}

ARROW_ASSIGN_OR_RAISE(auto data_buffer, arrow::AllocateResizableBuffer(data_len, pool));

// This is not strictly required but valgrind gets confused and detects this

// as uninitialized memory access. See arrow::util::SetBitTo().

if (type->id() == arrow::Type::BOOL) {

memset(data_buffer->mutable_data(), 0, data_len);

}

buffers.push_back(std::move(data_buffer));

*array_data = arrow::ArrayData::Make(type, num_records, std::move(buffers));

return Status::OK();

}
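
The sizing rules in AllocArrayData are easy to lose among the allocation calls, so here they are once more as plain arithmetic. This is a sketch that only computes byte counts and allocates nothing; `BufferSizes` and `OutputBufferSizes` are hypothetical names, and `BytesForBits` is a local reimplementation of the rounding-up division, not the Arrow function itself.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes needed to hold the given number of bits (rounded up).
constexpr int64_t BytesForBits(int64_t bits) { return (bits + 7) / 8; }

struct BufferSizes {
  int64_t validity;  // null bitmap: one bit per record
  int64_t offsets;   // 32-bit offsets, string/binary outputs only
  int64_t data;      // value buffer
};

// bit_width is the width of one fixed-width value (1 for bool, 32 for int32, ...).
// For string/binary outputs pass is_binary_like = true; bit_width is then unused.
BufferSizes OutputBufferSizes(int64_t num_records, int bit_width,
                              bool is_binary_like) {
  BufferSizes sizes{};
  sizes.validity = BytesForBits(num_records);
  if (is_binary_like) {
    // One more offset than records; the data size is unknown up front,
    // so the data buffer starts empty and has to grow later.
    sizes.offsets = BytesForBits((num_records + 1) * 32);
    sizes.data = 0;
  } else {
    sizes.data = BytesForBits(num_records * bit_width);
  }
  return sizes;
}
```

For 100 int32 records this gives a 13-byte validity bitmap and a 400-byte data buffer; for 100 strings it gives 404 bytes of offsets and an initially empty data buffer, which fits the resizable buffer used in the original function.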

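Oddly, there is no projector_test.cc sitting next to projector.cc; judging from the directory listing, the tests live in tests/projector_test.cc instead.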

lru_cache

An LRU cache adapted from Boost. Because the code is templated, you can't tell from here what is actually stored in it.

// modified from boost LRU cache -> the boost cache supported only an

// ordered map.

namespace gandiva {

// a cache which evicts the least recently used item when it is full

template <class Key, class Value>

class LruCache {

public:

using key_type = Key;

using value_type = Value;

using list_type = std::list<key_type>;

The test code simply uses string values

TEST_F(TestLruCache, TestLruBehavior) {

cache_.insert(TestCacheKey(1), "hello");

cache_.insert(TestCacheKey(2), "hello");

cache_.get(TestCacheKey(1));

cache_.insert(TestCacheKey(3), "hello");

// should have evicted key 2.

ASSERT_EQ(*cache_.get(TestCacheKey(1)), "hello");

}
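
For reference, the usual way to build such a cache is a hash map plus a recency list, and the test above can be replayed against a toy version. This is a minimal sketch of that standard technique. `ToyLruCache` is a hypothetical class, not the Gandiva template, and the `contains` method is added purely for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy LRU cache: a hash map for lookup plus a list ordered by recency.
template <class Key, class Value>
class ToyLruCache {
  using ListIt = typename std::list<Key>::iterator;
  using Map = std::unordered_map<Key, std::pair<Value, ListIt>>;

 public:
  explicit ToyLruCache(std::size_t capacity) : capacity_(capacity) {}

  void insert(const Key& key, const Value& value) {
    if (capacity_ == 0) return;
    auto it = map_.find(key);
    if (it != map_.end()) {  // existing key: update the value and mark as recent
      it->second.first = value;
      Touch(it);
      return;
    }
    if (map_.size() >= capacity_) {  // full: evict the least recently used key
      map_.erase(order_.back());
      order_.pop_back();
    }
    order_.push_front(key);
    map_.emplace(key, std::make_pair(value, order_.begin()));
  }

  std::optional<Value> get(const Key& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    Touch(it);  // a successful lookup counts as a use
    return it->second.first;
  }

  bool contains(const Key& key) const { return map_.count(key) != 0; }

 private:
  // Move the key to the front of the recency list; iterators stay valid.
  void Touch(typename Map::iterator it) {
    order_.splice(order_.begin(), order_, it->second.second);
  }

  std::size_t capacity_;
  std::list<Key> order_;  // front = most recently used
  Map map_;
};
```

With capacity 2, inserting keys 1 and 2, reading key 1 and then inserting key 3 evicts key 2, which is the behavior the test comment describes.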

llvm_types

llvm_types manages the creation of LLVM types in one place and maps Arrow types onto them. Similar code can be found in NoisePage as well.

class GANDIVA_EXPORT LLVMTypes {

public:

explicit LLVMTypes(llvm::LLVMContext& context);

llvm::Type* void_type() { return llvm::Type::getVoidTy(context_); }

llvm::Type* i1_type() { return llvm::Type::getInt1Ty(context_); }

llvm::Type* i8_type() { return llvm::Type::getInt8Ty(context_); }

llvm::Type* i16_type() { return llvm::Type::getInt16Ty(context_); }

llvm::Type* i32_type() { return llvm::Type::getInt32Ty(context_); }

llvm::Type* i64_type() { return llvm::Type::getInt64Ty(context_); }

llvm::Type* i128_type() { return llvm::Type::getInt128Ty(context_); }

llvm::StructType* i128_split_type() {

// struct with high/low bits (see decimal_ops.cc:DecimalSplit)

return llvm::StructType::get(context_, {i64_type(), i64_type()}, false);

}

Plus some simple constant initialization

llvm::Constant* i128_zero() { return i128_constant(0); }

llvm::Constant* i128_one() { return i128_constant(1); }

The related test code

TEST_F(TestLLVMTypes, TestFound) {

EXPECT_EQ(types_->IRType(arrow::Type::BOOL), types_->i1_type());

EXPECT_EQ(types_->IRType(arrow::Type::INT32), types_->i32_type());

EXPECT_EQ(types_->IRType(arrow::Type::INT64), types_->i64_type());

EXPECT_EQ(types_->IRType(arrow::Type::FLOAT), types_->float_type());

EXPECT_EQ(types_->IRType(arrow::Type::DOUBLE), types_->double_type());

EXPECT_EQ(types_->IRType(arrow::Type::DATE64), types_->i64_type());

EXPECT_EQ(types_->IRType(arrow::Type::TIME64), types_->i64_type());

EXPECT_EQ(types_->IRType(arrow::Type::TIMESTAMP), types_->i64_type());

EXPECT_EQ(types_->DataVecType(arrow::boolean()), types_->i1_type());

EXPECT_EQ(types_->DataVecType(arrow::int32()), types_->i32_type());

EXPECT_EQ(types_->DataVecType(arrow::int64()), types_->i64_type());

EXPECT_EQ(types_->DataVecType(arrow::float32()), types_->float_type());

EXPECT_EQ(types_->DataVecType(arrow::float64()), types_->double_type());

EXPECT_EQ(types_->DataVecType(arrow::date64()), types_->i64_type());

EXPECT_EQ(types_->DataVecType(arrow::time64(arrow::TimeUnit::MICRO)),

types_->i64_type());

EXPECT_EQ(types_->DataVecType(arrow::timestamp(arrow::TimeUnit::MILLI)),

types_->i64_type());

}

TEST_F(TestLLVMTypes, TestNotFound) {

EXPECT_EQ(types_->IRType(arrow::Type::SPARSE_UNION), nullptr);

EXPECT_EQ(types_->IRType(arrow::Type::DENSE_UNION), nullptr);

EXPECT_EQ(types_->DataVecType(arrow::null()), nullptr);

}

llvm_includes

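The MSVC warning suppression at the top is worth noting. It's the first time I've come across it, and it shows that Gandiva is meant to build on Windows too.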

#if defined(_MSC_VER)

# pragma warning(push)

# pragma warning(disable : 4141)

# pragma warning(disable : 4146)

# pragma warning(disable : 4244)

# pragma warning(disable : 4267)

# pragma warning(disable : 4291)

# pragma warning(disable : 4624)

#endif

It even accounts for differences between LLVM versions

#if LLVM_VERSION_MAJOR >= 10

# define LLVM_ALIGN(alignment) (llvm::Align((alignment)))

#else

# define LLVM_ALIGN(alignment) (alignment)

#endif

llvm_generator

The core of it all: LLVM code generation.

The generator appears to make effective use of caching

class GANDIVA_EXPORT LLVMGenerator {

public:

/// \brief Factory method to initialize the generator.

static Result<std::unique_ptr<LLVMGenerator>> Make(

const std::shared_ptr<Configuration>& config, bool cached,

std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache =

std::nullopt);

/// \brief Get the cache to be used for LLVM ObjectCache.

static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>

GetCache();

It stores the SelectionVector::Mode

SelectionVector::Mode selection_vector_mode() { return selection_vector_mode_; }

Build takes the expressions and generates code for them

/// \brief Build the code for the expression trees for default mode with a LLVM

/// ObjectCache. Each element in the vector represents an expression tree

Status Build(const ExpressionVector& exprs, SelectionVector::Mode mode);

/// \brief Build the code for the expression trees for default mode. Each

/// element in the vector represents an expression tree

Status Build(const ExpressionVector& exprs);

Execute feeds Arrow data into the compiled functions

/// \brief Execute the built expression against the provided arguments for

/// default mode.

Status Execute(const arrow::RecordBatch& record_batch,

const ArrayDataVector& output_vector) const;

/// \brief Execute the built expression against the provided arguments for

/// all modes. Only works on the records specified in the selection_vector.

Status Execute(const arrow::RecordBatch& record_batch,

const SelectionVector* selection_vector,

const ArrayDataVector& output_vector) const;

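The basic LLVMContext and IRBuilder are naturally present. What surprised me is that creating a global string involves no duplicate check here. I can't tell whether that is an oversight or whether the check happens earlier 😂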

llvm::LLVMContext* context() { return engine_->context(); }

llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }

llvm::Constant* CreateGlobalStringPtr(const std::string& string) {

return engine_->CreateGlobalStringPtr(string);

}

A Visitor then walks the decomposed expression tree (the Dex nodes) once more

class Visitor : public DexVisitor {

public:

Visitor(LLVMGenerator* generator, llvm::Function* function,

llvm::BasicBlock* entry_block, llvm::Value* arg_addrs,

llvm::Value* arg_local_bitmaps, llvm::Value* arg_holder_ptrs,

std::vector<llvm::Value*> slice_offsets, llvm::Value* arg_context_ptr,

llvm::Value* loop_var);

void Visit(const VectorReadValidityDex& dex) override;

void Visit(const VectorReadFixedLenValueDex& dex) override;

void Visit(const VectorReadVarLenValueDex& dex) override;

void Visit(const LocalBitMapValidityDex& dex) override;

void Visit(const TrueDex& dex) override;

void Visit(const FalseDex& dex) override;

void Visit(const LiteralDex& dex) override;

void Visit(const NonNullableFuncDex& dex) override;

void Visit(const NullableNeverFuncDex& dex) override;

void Visit(const NullableInternalFuncDex& dex) override;

void Visit(const IfDex& dex) override;

void Visit(const BooleanAndDex& dex) override;

void Visit(const BooleanOrDex& dex) override;

void Visit(const InExprDexBase<int32_t>& dex) override;

void Visit(const InExprDexBase<int64_t>& dex) override;

void Visit(const InExprDexBase<float>& dex) override;

void Visit(const InExprDexBase<double>& dex) override;

void Visit(const InExprDexBase<gandiva::DecimalScalar128>& dex) override;

void Visit(const InExprDexBase<std::string>& dex) override;

template <typename Type>

void VisitInExpression(const InExprDexBase<Type>& dex);

LValuePtr result() { return result_; }

bool has_arena_allocs() { return has_arena_allocs_; }

There are also dedicated functions for building call parameters, function calls, and if-else branches in LLVM IR

std::vector<llvm::Value*> BuildParams(int holder_idx,

const ValueValidityPairVector& args,

bool with_validity, bool with_context);

// Generate code to invoke a function call.

LValuePtr BuildFunctionCall(const NativeFunction* func, DataTypePtr arrow_return_type,

std::vector<llvm::Value*>* params);

// Generate code for an if-else condition.

LValuePtr BuildIfElse(llvm::Value* condition, std::function<LValuePtr()> then_func,

std::function<LValuePtr()> else_func,

DataTypePtr arrow_return_type);

Calls to precompiled LLVM IR functions are added through this interface

/// Generate code to make a function call (to a pre-compiled IR function) which takes

/// 'args' and has a return type 'ret_type'.

llvm::Value* AddFunctionCall(const std::string& full_name, llvm::Type* ret_type,

const std::vector<llvm::Value*>& args);

The cache implementation in detail

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>

LLVMGenerator::GetCache() {

static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>

shared_cache = std::make_shared<

Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>();

return shared_cache;

}

Status LLVMGenerator::SetLLVMObjectCache(GandivaObjectCache& object_cache) {

return engine_->SetLLVMObjectCache(object_cache);

}

Part of the Build implementation

Status LLVMGenerator::Build(const ExpressionVector& exprs, SelectionVector::Mode mode) {

selection_vector_mode_ = mode;

for (auto& expr : exprs) {

auto output = annotator_.AddOutputFieldDescriptor(expr->result());

ARROW_RETURN_NOT_OK(Add(expr, output));

}

// Compile and inject into the process' memory the generated function.

ARROW_RETURN_NOT_OK(engine_->FinalizeModule());

// setup the jit functions for each expression.

for (auto& compiled_expr : compiled_exprs_) {

auto fn_name = compiled_expr->GetFunctionName(mode);

ARROW_ASSIGN_OR_RAISE(auto fn_ptr, engine_->CompiledFunction(fn_name));

auto jit_fn = reinterpret_cast<EvalFunc>(fn_ptr);

compiled_expr->SetJITFunction(selection_vector_mode_, jit_fn);

}

return Status::OK();

}

This part deserves a closer read when time allows. As for tests, the example given here is a vector addition that LLVM auto-vectorizes.

TEST_F(TestLLVMGenerator, TestAdd) {

// Setup LLVM generator to do an arithmetic add of two vectors

ASSERT_OK_AND_ASSIGN(auto generator,

LLVMGenerator::Make(TestConfigWithIrDumping(), false));

Annotator annotator;

auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());

auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);

auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);

auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0);

auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);

auto field1 = std::make_shared<arrow::Field>("f1", arrow::int32());

auto desc1 = annotator.CheckAndAddInputFieldDescriptor(field1);

auto validity_dex1 = std::make_shared<VectorReadValidityDex>(desc1);

auto value_dex1 = std::make_shared<VectorReadFixedLenValueDex>(desc1);

auto pair1 = std::make_shared<ValueValidityPair>(validity_dex1, value_dex1);

DataTypeVector params{arrow::int32(), arrow::int32()};

auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());

FunctionSignature signature(func_desc->name(), func_desc->params(),

func_desc->return_type());

const NativeFunction* native_func =

generator->function_registry_->LookupSignature(signature);

std::vector<ValueValidityPairPtr> pairs{pair0, pair1};

auto func_dex = std::make_shared<NonNullableFuncDex>(

func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);

auto field_sum = std::make_shared<arrow::Field>("out", arrow::int32());

auto desc_sum = annotator.CheckAndAddInputFieldDescriptor(field_sum);

// LLVM 10 doesn't like the expr function name to be the same as the module name when

// LLJIT is used

std::string fn_name = "llvm_gen_test_add_expr";

ASSERT_OK(generator->engine_->LoadFunctionIRs());

ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,

SelectionVector::MODE_NONE));

ASSERT_OK(generator->engine_->FinalizeModule());

auto const& ir = generator->engine_->ir();

EXPECT_THAT(ir, testing::HasSubstr("vector.body"));

ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));

ASSERT_TRUE(fn_ptr);

auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);

constexpr size_t kNumRecords = 4;

std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};

std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};

uint64_t in_bitmap = 0xffffffffffffffffull;

std::array<uint32_t, kNumRecords> out{0, 0, 0, 0};

uint64_t out_bitmap = 0;

std::array<uint8_t*, 6> addrs{

reinterpret_cast<uint8_t*>(a0.data()), reinterpret_cast<uint8_t*>(&in_bitmap),

reinterpret_cast<uint8_t*>(a1.data()), reinterpret_cast<uint8_t*>(&in_bitmap),

reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap),

};

std::array<int64_t, 6> addr_offsets{0, 0, 0, 0, 0, 0};

eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,

0 /* dummy context ptr */, kNumRecords);

EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));

EXPECT_EQ(out_bitmap, 0ULL);

}

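GPT's breakdown of this part is worth a look:

The TestAdd test case exercises the complete flow of the LLVM code generator producing a vector addition. In detail:

Test goal

Verify that Gandiva can correctly generate and execute LLVM code that adds two int32 vectors.

Test flow

1. Initialize the LLVM generator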

ASSERT_OK_AND_ASSIGN(auto generator, LLVMGenerator::Make(TestConfigWithIrDumping(), false));

  • Creates an LLVM code generator with IR dumping enabled

2. Set up the input vector descriptors

// First input vector f0 (int32)

auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());

auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);

auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0); // validity bitmap

auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0); // data values

auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);

// Second input vector f1 (int32): set up the same way

3. Define the add function

DataTypeVector params{arrow::int32(), arrow::int32()};

auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());

  • Describes an add function that takes two int32 arguments and returns int32

4. Create the function expression

auto func_dex = std::make_shared<NonNullableFuncDex>(

func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);

  • Binds the two input value/validity pairs to the add function

5. Generate the LLVM code

ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,

SelectionVector::MODE_NONE));

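  • Generates LLVM IR for the expression
  • The literal 4 is passed at code-generation time and looks like a buffer count rather than a record count (an assumption based on the call site); the number of records, kNumRecords, is only supplied later, when the compiled function is called

6. Compile and obtain the function pointer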

ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));

auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);

  • Compiles the LLVM IR to machine code
  • Obtains a callable function pointer

7. Prepare the test data

std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4}; // first vector

std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8}; // second vector

uint64_t in_bitmap = 0xffffffffffffffffull; // all valid

std::array<uint32_t, kNumRecords> out{0, 0, 0, 0}; // output vector

8. Execute the generated function

eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,

0 /* dummy context ptr */, kNumRecords);

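  • Calls the compiled function to perform the vector addition

9. Verify the result

EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12)); // 1+5, 2+6, 3+7, 4+8

EXPECT_EQ(out_bitmap, 0ULL); // output bitmap is still 0: in Arrow a set bit means valid, so this suggests the value function wrote no validity bits

Key technical points

1. Verifying the vectorization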

EXPECT_THAT(ir, testing::HasSubstr("vector.body"));

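  • Checks that the generated LLVM IR contains a vectorized loop body (vector.body)
  • Confirms that LLVM applied its vectorization optimization

2. Memory layout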

std::array<uint8_t*, 6> addrs{

reinterpret_cast<uint8_t*>(a0.data()), // first vector: data

reinterpret_cast<uint8_t*>(&in_bitmap), // first vector: validity bitmap

reinterpret_cast<uint8_t*>(a1.data()), // second vector: data

reinterpret_cast<uint8_t*>(&in_bitmap), // second vector: validity bitmap

reinterpret_cast<uint8_t*>(out.data()), // output vector: data

reinterpret_cast<uint8_t*>(&out_bitmap) // output vector: validity bitmap

};

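What the test shows

This test covers Gandiva's core functionality:

  1. Expression handling: a high-level expression is turned into the internal representation
  2. Code generation: LLVM IR is generated for it
  3. Compile and execute: the IR is compiled to machine code and run
  4. Vectorization: the optimized IR is checked for a vectorized loop
  5. Correctness: the computed results are verified

It works as an end-to-end check that the whole code-generation and execution pipeline functions correctly.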
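
To picture what the JIT-compiled function does with the `addrs` array, here is a hand-written stand-in with the same buffer layout as the test: data and validity buffers alternate, inputs first and output last. This is only a sketch of the value computation. `AddUInt32Values` is a hypothetical function, and the real generated code takes more arguments (offsets, local bitmaps, holders, a context pointer).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hand-written stand-in for the generated "add" value function.
// Buffer layout, as in the test:
//   addrs[0] = f0 data,  addrs[1] = f0 validity,
//   addrs[2] = f1 data,  addrs[3] = f1 validity,
//   addrs[4] = out data, addrs[5] = out validity
void AddUInt32Values(uint8_t** addrs, int64_t num_records) {
  const auto* a = reinterpret_cast<const uint32_t*>(addrs[0]);
  const auto* b = reinterpret_cast<const uint32_t*>(addrs[2]);
  auto* out = reinterpret_cast<uint32_t*>(addrs[4]);
  // A simple counted loop with no branches: the shape that a compiler's
  // auto-vectorizer can turn into SIMD code.
  for (int64_t i = 0; i < num_records; ++i) {
    out[i] = a[i] + b[i];
  }
  // Validity buffers (addrs[1], addrs[3], addrs[5]) are deliberately untouched:
  // this function computes values only.
}
```

Because the loop never touches the validity buffers, the output bitmap stays exactly as the caller initialized it. That is consistent with the test's final check that `out_bitmap` is still 0.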

The testing::HasSubstr used here is part of GMock.

You can also see that a plain C function can be registered directly

TEST_F(TestLLVMGenerator, VerifyExtendedCFunctions) {

VerifyFunctionMapping("multiply_by_three_int32", [](auto registry) {

return TestConfigWithCFunction(std::move(registry));

});

//test_util.cc

std::shared_ptr<Configuration> TestConfigWithCFunction(

std::shared_ptr<FunctionRegistry> registry) {

return BuildConfigurationWithRegistry(std::move(registry), [](auto reg) {

return reg->Register(GetTestExternalCFunction(),

reinterpret_cast<void*>(multiply_by_three));

});

}

static int64_t multiply_by_three(int32_t value) { return value * 3; }
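The registration idea can be sketched without Gandiva: store the function as an untyped pointer under a name, then cast it back to its real signature at call time. ToyRegistry and call_registered are hypothetical names for illustration only; the real FunctionRegistry keys functions by NativeFunction metadata, not by a bare string.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Same idea as the test helper; widened before multiplying to avoid overflow.
inline int64_t multiply_by_three(int32_t value) {
  return static_cast<int64_t>(value) * 3;
}

// Hypothetical miniature registry: function name -> untyped function pointer.
class ToyRegistry {
 public:
  void Register(const std::string& name, void* fn) {
    assert(fn != nullptr);
    fns_[name] = fn;
  }
  void* Lookup(const std::string& name) const {
    auto it = fns_.find(name);
    return it == fns_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, void*> fns_;
};

// Looks the function up and calls it through its real signature.
inline std::optional<int64_t> call_registered(const ToyRegistry& reg,
                                              const std::string& name, int32_t arg) {
  using Fn = int64_t (*)(int32_t);
  auto fn = reinterpret_cast<Fn>(reg.Lookup(name));
  if (fn == nullptr) return std::nullopt;
  return fn(arg);
}
```

The caller has to know the true signature when casting back, which is presumably why the real registry stores the signature metadata next to the pointer.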

literal_holder

A single type that Gandiva uses to represent and handle constant values of every supported type

namespace gandiva {

using LiteralHolder =

std::variant<bool, float, double, int8_t, int16_t, int32_t, int64_t, uint8_t,

uint16_t, uint32_t, uint64_t, std::string, DecimalScalar128>;

GANDIVA_EXPORT std::string ToString(const LiteralHolder& holder);

} // namespace gandiva

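std::variant, introduced in C++17, is a type-safe union: at run time it holds a value of exactly one of a fixed set of alternative types, without the pitfalls of a traditional union.

Rust's enum can be seen as a more powerful version of std::variant.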
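As a small illustration of how such a variant is consumed, the sketch below uses std::visit to turn a literal into a string. ToyLiteral and ToyToString are hypothetical names; the alternative list is trimmed compared with LiteralHolder, and this is not how Gandiva's own ToString is implemented.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical, trimmed-down literal type (LiteralHolder has more alternatives).
using ToyLiteral = std::variant<bool, int32_t, double, std::string>;

// Dispatches on the alternative currently stored in the variant.
inline std::string ToyToString(const ToyLiteral& literal) {
  return std::visit(
      [](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, bool>) {
          return value ? "true" : "false";
        } else if constexpr (std::is_same_v<T, std::string>) {
          return value;
        } else {
          return std::to_string(value);  // numeric alternatives
        }
      },
      literal);
}
```

The compiler checks that the visitor handles every alternative, which is the "type-safe" part that a raw union cannot offer.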

interval_holder

Handles the various kinds of time intervals

// Pass only years and days to cast

data = "P12Y15D";

response = cast_interval_day(&execution_context_, data.data(), 7, true, &out_valid);

qty_days_in_response = 15;

qty_millis_in_response = 0;

EXPECT_TRUE(out_valid);

EXPECT_FALSE(execution_context_.has_error());

EXPECT_EQ(response, (qty_millis_in_response << 32) | qty_days_in_response);
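The last assertion composes the expected value as (millis << 32) | days, so a day-time interval fits in one 64-bit integer. A minimal sketch of that packing follows, with hypothetical helper names and unsigned intermediates so the shifts stay well defined:

```cpp
#include <cstdint>

// Packs milliseconds into the upper 32 bits and days into the lower 32 bits,
// mirroring the expression used in the assertion above.
inline int64_t PackDayTime(int32_t days, int32_t millis) {
  const uint64_t hi = static_cast<uint64_t>(static_cast<uint32_t>(millis)) << 32;
  const uint64_t lo = static_cast<uint32_t>(days);
  return static_cast<int64_t>(hi | lo);
}

// Extracts the day count from the lower 32 bits.
inline int32_t UnpackDays(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint32_t>(static_cast<uint64_t>(packed)));
}

// Extracts the millisecond count from the upper 32 bits.
inline int32_t UnpackMillis(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint32_t>(static_cast<uint64_t>(packed) >> 32));
}
```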

hash_utils

The hash component is built on OpenSSL and mainly provides the SHA family and MD5 functions

GANDIVA_EXPORT

const char* gdv_sha512_hash(int64_t context, const void* message, size_t message_length,

int32_t* out_length) {

constexpr int sha512_result_length = 128;

return gdv_hash_using_openssl(context, message, message_length, EVP_sha512(),

sha512_result_length, out_length);

}

/// Hashes a generic message using the SHA256 algorithm

GANDIVA_EXPORT

const char* gdv_sha256_hash(int64_t context, const void* message, size_t message_length,

int32_t* out_length) {

constexpr int sha256_result_length = 64;

return gdv_hash_using_openssl(context, message, message_length, EVP_sha256(),

sha256_result_length, out_length);

}

/// Hashes a generic message using the SHA1 algorithm

GANDIVA_EXPORT

const char* gdv_sha1_hash(int64_t context, const void* message, size_t message_length,

int32_t* out_length) {

constexpr int sha1_result_length = 40;

return gdv_hash_using_openssl(context, message, message_length, EVP_sha1(),

sha1_result_length, out_length);

}

GANDIVA_EXPORT

const char* gdv_md5_hash(int64_t context, const void* message, size_t message_length,

int32_t* out_length) {

constexpr int md5_result_length = 32;

return gdv_hash_using_openssl(context, message, message_length, EVP_md5(),

md5_result_length, out_length);

}
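The result lengths above (128, 64, 40, 32) are exactly twice the raw digest sizes of SHA-512, SHA-256, SHA-1 and MD5 (64, 32, 20 and 16 bytes), which is what you get when each digest byte is written as two hexadecimal characters. I am assuming that is what gdv_hash_using_openssl does; the sketch below only shows the hex step, with a hypothetical ToHex helper and no OpenSSL involved.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encodes each byte as two lowercase hexadecimal characters.
inline std::string ToHex(const std::vector<uint8_t>& digest) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(digest.size() * 2);
  for (uint8_t byte : digest) {
    out.push_back(kDigits[byte >> 4]);    // high nibble
    out.push_back(kDigits[byte & 0x0F]);  // low nibble
  }
  return out;
}
```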

gandiva_object_cache

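This caches the outcome of compiling an expression such as result1 = evaluate("column1 + column2 * 3");. What gets cached is the compiled object code rather than the evaluation result: the class derives from llvm::ObjectCache and keeps the code in an llvm::MemoryBuffer stored under an ExpressionCacheKey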

class GandivaObjectCache : public llvm::ObjectCache {

public:

explicit GandivaObjectCache(

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>&

cache,

ExpressionCacheKey key);

~GandivaObjectCache() {}

void notifyObjectCompiled(const llvm::Module* M, llvm::MemoryBufferRef Obj);

std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module* M);

private:

ExpressionCacheKey cache_key_;

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> cache_;

};
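The two hooks follow a simple pattern: look the key up before compiling, and store the object code after compiling. Below is a toy version of that pattern with a string key and a byte vector standing in for ExpressionCacheKey and llvm::MemoryBuffer; ToyObjectCache is a hypothetical name, and eviction and thread safety are deliberately left out.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using ObjectBytes = std::vector<unsigned char>;

// Hypothetical miniature of the get / notify pair of an object cache.
class ToyObjectCache {
 public:
  // After a compile: remember the object code for this expression key.
  void Put(const std::string& key, ObjectBytes bytes) {
    cache_[key] = std::make_shared<ObjectBytes>(std::move(bytes));
  }

  // Before a compile: returns nullptr on a miss so the caller knows to compile.
  std::shared_ptr<ObjectBytes> Get(const std::string& key) const {
    auto it = cache_.find(key);
    return it == cache_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<ObjectBytes>> cache_;
};
```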

function_signature

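Function signatures can be hashed; my guess is that the hash serves as a lookup key (for caching, or for a signature map like the one in the function registry). The second test also shows that the hash ignores the case of the function name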

EXPECT_EQ(FunctionSignature("extract_month", {arrow::date32()}, arrow::int64()),

FunctionSignature("extract_month", {local_date32_type_}, local_i64_type_));

TEST_F(TestFunctionSignature, TestHash) {

FunctionSignature f1("add", {arrow::int32(), arrow::int32()}, arrow::int64());

FunctionSignature f2("add", {local_i32_type_, local_i32_type_}, local_i64_type_);

EXPECT_EQ(f1.Hash(), f2.Hash());

FunctionSignature f3("extractDay", {arrow::int64()}, arrow::int64());

FunctionSignature f4("extractday", {arrow::int64()}, arrow::int64());

EXPECT_EQ(f3.Hash(), f4.Hash());

}
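The extractDay / extractday pair suggests the name is normalized before hashing. A minimal sketch of such a hash, assuming lowercase folding plus a boost-style hash combine (my choice for illustration, not confirmed against the Gandiva source), with types reduced to plain strings:

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical simplified signature: type names are plain strings here.
struct ToySignature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

// ASCII lowercase folding so that "extractDay" and "extractday" agree.
inline std::string ToLower(std::string s) {
  for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Mixes one more string into the running hash value.
inline void HashCombine(std::size_t& seed, const std::string& value) {
  seed ^= std::hash<std::string>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

inline std::size_t Hash(const ToySignature& sig) {
  std::size_t seed = 0;
  HashCombine(seed, ToLower(sig.name));
  for (const auto& type : sig.param_types) HashCombine(seed, type);
  HashCombine(seed, sig.return_type);
  return seed;
}
```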

function_registry

class GANDIVA_EXPORT FunctionRegistry {

public:

using iterator = const NativeFunction*;

using FunctionHolderMaker =

std::function<arrow::Result<std::shared_ptr<FunctionHolder>>(

const FunctionNode& function_node)>;

FunctionRegistry();

FunctionRegistry(const FunctionRegistry&) = delete;

FunctionRegistry& operator=(const FunctionRegistry&) = delete;

/// Lookup a pre-compiled function by its signature.

const NativeFunction* LookupSignature(const FunctionSignature& signature) const;

/// \brief register a set of functions into the function registry from a given bitcode

/// file

arrow::Status Register(const std::vector<NativeFunction>& funcs,

const std::string& bitcode_path);

/// \brief register a set of functions into the function registry from a given bitcode

/// buffer

arrow::Status Register(const std::vector<NativeFunction>& funcs,

std::shared_ptr<arrow::Buffer> bitcode_buffer);

/// \brief register a C function into the function registry

/// @param func the registered function's metadata

/// @param c_function_ptr the function pointer to the

/// registered function's implementation

/// @param function_holder_maker this will be used as the function holder if the

/// function requires a function holder

arrow::Status Register(

NativeFunction func, void* c_function_ptr,

std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);

/// \brief get a list of bitcode memory buffers saved in the registry

const std::vector<std::shared_ptr<arrow::Buffer>>& GetBitcodeBuffers() const;

/// \brief get a list of C functions saved in the registry

const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;

const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;

iterator begin() const;

iterator end() const;

iterator back() const;

friend arrow::Result<std::shared_ptr<FunctionRegistry>> MakeDefaultFunctionRegistry();

private:

std::vector<NativeFunction> pc_registry_;

SignatureMap pc_registry_map_;

std::vector<std::shared_ptr<arrow::Buffer>> bitcode_memory_buffers_;

std::vector<std::pair<NativeFunction, void*>> c_functions_;

FunctionHolderMakerRegistry holder_maker_registry_;

Status Add(NativeFunction func);

};

/// \brief get the default function registry

GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();

} // namespace gandiva

function_ir_builder

A very general-purpose IR builder (why didn't I think of this before.jpg); it can even emit the block jumps for an if-else

class FunctionIRBuilder {

public:

explicit FunctionIRBuilder(Engine* engine) : engine_(engine) {}

virtual ~FunctionIRBuilder() = default;

protected:

LLVMTypes* types() { return engine_->types(); }

llvm::Module* module() { return engine_->module(); }

llvm::LLVMContext* context() { return engine_->context(); }

llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }

llvm::Constant* CreateGlobalStringPtr(const std::string& string) {

return engine_->CreateGlobalStringPtr(string);

}

/// Build an if-else block.

llvm::Value* BuildIfElse(llvm::Value* condition, llvm::Type* return_type,

std::function<llvm::Value*()> then_func,

std::function<llvm::Value*()> else_func);

struct NamedArg {

std::string name;

llvm::Type* type;

};

/// Build llvm fn.

llvm::Function* BuildFunction(const std::string& function_name, llvm::Type* return_type,

std::vector<NamedArg> in_args);

private:

Engine* engine_;

};
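The callback shape of BuildIfElse is easier to see in a toy that emits text instead of real LLVM objects. ToyIRBuilder below is entirely hypothetical: it drops the return_type parameter and writes LLVM-flavoured pseudo-IR lines, whereas the real builder creates basic blocks and a phi node through llvm::IRBuilder.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy builder that records pseudo-IR text lines.
class ToyIRBuilder {
 public:
  // Emits then / else / merge blocks and returns the name of the merged value.
  // The callbacks may emit their own lines and return the value they produced.
  std::string BuildIfElse(const std::string& condition,
                          const std::function<std::string()>& then_func,
                          const std::function<std::string()>& else_func) {
    Emit("br i1 " + condition + ", label %then, label %else");
    Emit("then:");
    const std::string then_value = then_func();
    Emit("br label %merge");
    Emit("else:");
    const std::string else_value = else_func();
    Emit("br label %merge");
    Emit("merge:");
    Emit("%result = phi [ " + then_value + ", %then ], [ " + else_value + ", %else ]");
    return "%result";
  }

  void Emit(const std::string& line) { lines_.push_back(line); }
  const std::vector<std::string>& lines() const { return lines_; }

 private:
  std::vector<std::string> lines_;
};
```

Passing the two branches as callbacks lets the helper own the block bookkeeping, so the caller only describes what each branch computes.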

filter

The filter is also implemented on top of LLVM and looks much like the Projector

private:

std::unique_ptr<LLVMGenerator> llvm_generator_;

SchemaPtr schema_;

std::shared_ptr<Configuration> configuration_;

bool built_from_cache_;

To enable caching, it is enough to call SetLLVMObjectCache

Status Engine::SetLLVMObjectCache(GandivaObjectCache& object_cache) {

auto cached_buffer = object_cache.getObject(nullptr);

if (cached_buffer) {

auto error = lljit_->addObjectFile(std::move(cached_buffer));

if (error) {

return Status::CodeGenError("Failed to add cached object file to LLJIT: ",

llvm::toString(std::move(error)));

}

}

return Status::OK();

}

The optimization passes are attached through the PassManager

static void OptimizeModuleWithNewPassManager(llvm::Module& module,

llvm::TargetIRAnalysis target_analysis) {

// Setup an optimiser pipeline

llvm::PassBuilder pass_builder;

llvm::LoopAnalysisManager loop_am;

llvm::FunctionAnalysisManager function_am;

llvm::CGSCCAnalysisManager cgscc_am;

llvm::ModuleAnalysisManager module_am;

function_am.registerPass([&] { return target_analysis; });

// Register required analysis managers

pass_builder.registerModuleAnalyses(module_am);

pass_builder.registerCGSCCAnalyses(cgscc_am);

pass_builder.registerFunctionAnalyses(function_am);

pass_builder.registerLoopAnalyses(loop_am);

pass_builder.crossRegisterProxies(loop_am, function_am, cgscc_am, module_am);

pass_builder.registerPipelineStartEPCallback([&](llvm::ModulePassManager& module_pm,

llvm::OptimizationLevel Level) {

module_pm.addPass(llvm::ModuleInlinerPass());

llvm::FunctionPassManager function_pm;

function_pm.addPass(llvm::InstCombinePass());

function_pm.addPass(llvm::PromotePass());

function_pm.addPass(llvm::GVNPass());

function_pm.addPass(llvm::NewGVNPass());

function_pm.addPass(llvm::SimplifyCFGPass());

function_pm.addPass(llvm::LoopVectorizePass());

function_pm.addPass(llvm::SLPVectorizerPass());

module_pm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(function_pm)));

module_pm.addPass(llvm::GlobalOptPass());

});

engine

Almost all of the LLVM engine setup lives in engine.h, engine.cc and engine_llvm_test.cc. The engine can also load precompiled LLVM IR

/// load pre-compiled IR modules from precompiled_bitcode.cc and merge them into

/// the main module.

Status LoadPreCompiledIR();

// load external pre-compiled bitcodes into module

Status LoadExternalPreCompiledIR();

// Create and add mappings for cpp functions that can be accessed from LLVM.

arrow::Status AddGlobalMappings();

// Remove unused functions to reduce compile time.

Status RemoveUnusedFunctions();

std::unique_ptr<llvm::LLVMContext> context_;

std::unique_ptr<llvm::orc::LLJIT> lljit_;

std::unique_ptr<llvm::IRBuilder<>> ir_builder_;

std::unique_ptr<llvm::Module> module_;

LLVMTypes types_;

std::vector<std::string> functions_to_compile_;

bool optimize_ = true;

bool module_finalized_ = false;

bool cached_;

bool functions_loaded_ = false;

std::shared_ptr<FunctionRegistry> function_registry_;

std::string module_ir_;

std::unique_ptr<llvm::TargetMachine> target_machine_;

const std::shared_ptr<Configuration> conf_;

};

encrypt

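Gandiva also ships encryption helpers (although I could not find any documentation on how to use them); the AES implementation they use comes from OpenSSL as well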

GANDIVA_EXPORT

int32_t aes_encrypt(const char* plaintext, int32_t plaintext_len, const char* key,

int32_t key_len, unsigned char* cipher);

/**

* Decrypt data using aes algorithm

**/

GANDIVA_EXPORT

int32_t aes_decrypt(const char* ciphertext, int32_t ciphertext_len, const char* key,

int32_t key_len, unsigned char* plaintext);

The corresponding test

TEST(TestShaEncryptUtils, TestAesEncryptDecrypt) {

// 16 bytes key

auto* key = "12345678abcdefgh";

auto* to_encrypt = "some test string";

auto key_len = static_cast<int32_t>(strlen(reinterpret_cast<const char*>(key)));

auto to_encrypt_len =

static_cast<int32_t>(strlen(reinterpret_cast<const char*>(to_encrypt)));

unsigned char cipher_1[64];

int32_t cipher_1_len =

gandiva::aes_encrypt(to_encrypt, to_encrypt_len, key, key_len, cipher_1);

unsigned char decrypted_1[64];

int32_t decrypted_1_len = gandiva::aes_decrypt(reinterpret_cast<const char*>(cipher_1),

cipher_1_len, key, key_len, decrypted_1);

EXPECT_EQ(std::string(reinterpret_cast<const char*>(to_encrypt), to_encrypt_len),

std::string(reinterpret_cast<const char*>(decrypted_1), decrypted_1_len));

decimal_ir

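Code generation for decimal values (128-bit decimals, not floating-point numbers) gets special treatment; there seem to be quite a few pitfalls here 😂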

class DecimalIR : public FunctionIRBuilder {

public:

explicit DecimalIR(Engine* engine)

: FunctionIRBuilder(engine), enable_ir_traces_(false) {}

/// Build decimal IR functions and add them to the engine.

static Status AddFunctions(Engine* engine);

void EnableTraces() { enable_ir_traces_ = true; }

llvm::Value* CallDecimalFunction(const std::string& function_name,

llvm::Type* return_type,

const std::vector<llvm::Value*>& args);

private:

/// The intrinsic fn for divide with small divisors is about 10x slower, so not

/// using these.

static const bool kUseOverflowIntrinsics = false;

// Holder for an i128 value, along with its with scale and precision.

class ValueFull {

public:

ValueFull(llvm::Value* value, llvm::Value* precision, llvm::Value* scale)

: value_(value), precision_(precision), scale_(scale) {}

llvm::Value* value() const { return value_; }

llvm::Value* precision() const { return precision_; }

llvm::Value* scale() const { return scale_; }

private:

llvm::Value* value_;

llvm::Value* precision_;

llvm::Value* scale_;

};

// Holder for an i128 value, and a boolean indicating overflow.

class ValueWithOverflow {

public:

ValueWithOverflow(llvm::Value* value, llvm::Value* overflow)

: value_(value), overflow_(overflow) {}

// Make from IR struct

static ValueWithOverflow MakeFromStruct(DecimalIR* decimal_ir, llvm::Value* dstruct);

// Build a corresponding IR struct

llvm::Value* AsStruct(DecimalIR* decimal_ir) const;

llvm::Value* value() const { return value_; }

llvm::Value* overflow() const { return overflow_; }

private:

llvm::Value* value_;

llvm::Value* overflow_;

};
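To get a feel for why a value/overflow pair is carried around, here is a small sketch of scale-aligned decimal addition with explicit overflow flags. It uses int64_t as a stand-in for the i128 values, ignores precision, and all names (ToyValueWithOverflow, ScaleUp, AddDecimal) are hypothetical; it is not the IR that DecimalIR emits.

```cpp
#include <cstdint>
#include <limits>

constexpr int64_t kToyMax = std::numeric_limits<int64_t>::max();
constexpr int64_t kToyMin = std::numeric_limits<int64_t>::min();

// A value together with a flag saying whether computing it overflowed.
struct ToyValueWithOverflow {
  int64_t value;
  bool overflow;
};

// Multiplies by 10^scale_up, flagging overflow instead of wrapping.
inline ToyValueWithOverflow ScaleUp(int64_t value, int32_t scale_up) {
  for (int32_t i = 0; i < scale_up; ++i) {
    if (value > kToyMax / 10 || value < kToyMin / 10) return {0, true};
    value *= 10;
  }
  return {value, false};
}

// Adds two unscaled decimals; the result uses the larger of the two scales.
inline ToyValueWithOverflow AddDecimal(int64_t x, int32_t x_scale,
                                       int64_t y, int32_t y_scale) {
  const int32_t out_scale = x_scale > y_scale ? x_scale : y_scale;
  const ToyValueWithOverflow xs = ScaleUp(x, out_scale - x_scale);
  const ToyValueWithOverflow ys = ScaleUp(y, out_scale - y_scale);
  if (xs.overflow || ys.overflow) return {0, true};
  if ((ys.value > 0 && xs.value > kToyMax - ys.value) ||
      (ys.value < 0 && xs.value < kToyMin - ys.value)) {
    return {0, true};
  }
  return {xs.value + ys.value, false};
}
```

For example, 1.25 (125 at scale 2) plus 0.5 (5 at scale 1) is computed as 125 + 50 at scale 2.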

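Appendix: mapping from Arrow types to C function types

| Gandiva type (Arrow data type) | C function type |
|---|---|
| int8 | int8_t |
| int16 | int16_t |
| int32 | int32_t |
| int64 | int64_t |
| uint8 | uint8_t |
| uint16 | uint16_t |
| uint32 | uint32_t |
| uint64 | uint64_t |
| float32 | float |
| float64 | double |
| boolean | bool |
| date32 | int32_t |
| date64 | int64_t |
| timestamp | int64_t |
| time32 | int32_t |
| time64 | int64_t |
| interval_month | int32_t |
| interval_day_time | int64_t |
| utf8 (as parameter type) | const char*, uint32_t |
| utf8 (as return type) | int64_t context, const char*, uint32_t* |
| binary (as parameter type) | const char*, uint32_t |
| binary (as return type) | int64_t context, const char*, uint32_t* |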
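Reading the utf8 rows together, a string-in, string-out function would take a context, a pointer plus length for the input, and an out-parameter for the result length. The sketch below is my interpretation of the table, not code from Gandiva: to_upper_utf8 is a hypothetical name, it handles ASCII only, and it returns a thread_local buffer where a real external function would allocate from the execution context.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical external function shaped after the utf8 rows:
//   parameter -> (const char*, uint32_t)
//   return    -> (int64_t context, const char* result, uint32_t* out_len)
inline const char* to_upper_utf8(int64_t /*context*/, const char* data,
                                 uint32_t data_len, uint32_t* out_len) {
  // Assumption for self-containment: result lives in a thread_local buffer.
  thread_local std::string buffer;
  buffer.assign(data, data_len);
  for (char& c : buffer) {
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  }
  *out_len = static_cast<uint32_t>(buffer.size());
  return buffer.data();
}
```

Strings are passed as pointer plus length because Arrow string data is not null-terminated.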

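Summary

Setting aside the puzzling decision to keep every source file in one flat directory, the code quality is high. The comments at key points and the test cases make it possible to understand what Gandiva does, which makes it an interesting read.

There is a guide for writing Gandiva external functions, but to see how things are actually used you still have to read the test cases.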