
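I first came across Apache Arrow Gandiva by accident while browsing the Arrow project. Drawn in by the words LLVM and JIT on the project page, I even installed and ran it on Ubuntu, but in the end I could not work out what scenario I would actually use it in, so I gave up 😂

Then a few days ago, after reading the NoisePage paper and part of its source, I kept feeling that I had seen the combination of Arrow and LLVM somewhere before: it was Apache Arrow Gandiva. So this time I decided to read its source as well and find out what it really is.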

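Project history and current status

Dremio donated the project to Apache Arrow in 2018, and it is now one of Arrow's subprojects (source: Gandiva: A LLVM-based Analytical Expression Compiler for Apache Arrow). Dig a little further and you find that quite a few Arrow contributors now work at Dremio, that Dremio itself uses Apache Arrow, and that Gandiva is described as part of Dremio's execution engine.

Gandiva's main highlight is that it relies on LLVM's auto-vectorization to process Arrow data in a vectorized way. On top of the LLVM layer it implements Project and Filter. Add Join and Aggregation and most SQL operations would be covered; bring NoisePage into the picture and you could even get a complete, purely LLVM-based CRUD pipeline over Arrow.

There are rumors online that the project has been abandoned (that is simply not the case 😅). In fact Gandiva has had maintenance commits all along, and it followed up quickly after LLVM 20 came out this year.

Gandiva currently has C and C++ libraries, but the Rust implementation of Arrow does not seem to support it: Interfaces for gandiva bindings.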

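Source walkthrough

The code was downloaded on 2025-06-24. Apart from the precompiled and tests subdirectories, all of it sits flat in a single directory.

Gandiva source: https://github.com/apache/arrow/tree/main/cpp/src/gandiva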

|-- CMakeLists.txt
|-- GandivaConfig.cmake.in
|-- annotator.cc
|-- annotator.h
|-- annotator_test.cc
|-- arrow.h
|-- basic_decimal_scalar.h
|-- bitmap_accumulator.cc
|-- bitmap_accumulator.h
|-- bitmap_accumulator_test.cc
|-- cache.cc
|-- cache.h
|-- cache_test.cc
|-- cast_time.cc
|-- compiled_expr.h
|-- condition.h
|-- configuration.cc
|-- configuration.h
|-- context_helper.cc
|-- date_utils.cc
|-- date_utils.h
|-- decimal_ir.cc
|-- decimal_ir.h
|-- decimal_scalar.h
|-- decimal_type_util.cc
|-- decimal_type_util.h
|-- decimal_type_util_test.cc
|-- decimal_xlarge.cc
|-- decimal_xlarge.h
|-- dex.h
|-- dex_visitor.h
|-- encrypt_utils.cc
|-- encrypt_utils.h
|-- encrypt_utils_test.cc
|-- engine.cc
|-- engine.h
|-- engine_llvm_test.cc
|-- eval_batch.h
|-- execution_context.h
|-- exported_funcs.cc
|-- exported_funcs.h
|-- exported_funcs_registry.cc
|-- exported_funcs_registry.h
|-- exported_funcs_registry_test.cc
|-- expr_decomposer.cc
|-- expr_decomposer.h
|-- expr_decomposer_test.cc
|-- expr_validator.cc
|-- expr_validator.h
|-- expression.cc
|-- expression.h
|-- expression_cache_key.h
|-- expression_registry.cc
|-- expression_registry.h
|-- expression_registry_test.cc
|-- external_c_functions.cc
|-- field_descriptor.h
|-- filter.cc
|-- filter.h
|-- formatting_utils.h
|-- func_descriptor.h
|-- function_holder.h
|-- function_holder_maker_registry.cc
|-- function_holder_maker_registry.h
|-- function_ir_builder.cc
|-- function_ir_builder.h
|-- function_registry.cc
|-- function_registry.h
|-- function_registry_arithmetic.cc
|-- function_registry_arithmetic.h
|-- function_registry_common.h
|-- function_registry_datetime.cc
|-- function_registry_datetime.h
|-- function_registry_hash.cc
|-- function_registry_hash.h
|-- function_registry_math_ops.cc
|-- function_registry_math_ops.h
|-- function_registry_string.cc
|-- function_registry_string.h
|-- function_registry_test.cc
|-- function_registry_timestamp_arithmetic.cc
|-- function_registry_timestamp_arithmetic.h
|-- function_signature.cc
|-- function_signature.h
|-- function_signature_test.cc
|-- gandiva.pc.in
|-- gandiva_aliases.h
|-- gandiva_object_cache.cc
|-- gandiva_object_cache.h
|-- gdv_function_stubs.cc
|-- gdv_function_stubs.h
|-- gdv_function_stubs_test.cc
|-- gdv_hash_function_stubs.cc
|-- gdv_string_function_stubs.cc
|-- hash_utils.cc
|-- hash_utils.h
|-- hash_utils_test.cc
|-- in_holder.h
|-- interval_holder.cc
|-- interval_holder.h
|-- interval_holder_test.cc
|-- literal_holder.cc
|-- literal_holder.h
|-- llvm_generator.cc
|-- llvm_generator.h
|-- llvm_generator_test.cc
|-- llvm_includes.h
|-- llvm_types.cc
|-- llvm_types.h
|-- llvm_types_test.cc
|-- local_bitmaps_holder.h
|-- lru_cache.h
|-- lru_cache_test.cc
|-- lvalue.h
|-- make_precompiled_bitcode.py
|-- native_function.h
|-- node.h
|-- node_visitor.h
|-- precompiled
|   |-- CMakeLists.txt
|   |-- arithmetic_ops.cc
|   |-- arithmetic_ops_test.cc
|   |-- bitmap.cc
|   |-- bitmap_test.cc
|   |-- decimal_ops.cc
|   |-- decimal_ops.h
|   |-- decimal_ops_test.cc
|   |-- decimal_wrapper.cc
|   |-- epoch_time_point.h
|   |-- epoch_time_point_test.cc
|   |-- extended_math_ops.cc
|   |-- extended_math_ops_test.cc
|   |-- hash.cc
|   |-- hash_test.cc
|   |-- print.cc
|   |-- string_ops.cc
|   |-- string_ops_test.cc
|   |-- testing.h
|   |-- time.cc
|   |-- time_constants.h
|   |-- time_fields.h
|   |-- time_test.cc
|   |-- timestamp_arithmetic.cc
|   `-- types.h
|-- precompiled_bitcode.cc.in
|-- projector.cc
|-- projector.h
|-- random_generator_holder.cc
|-- random_generator_holder.h
|-- random_generator_holder_test.cc
|-- regex_functions_holder.cc
|-- regex_functions_holder.h
|-- regex_functions_holder_test.cc
|-- regex_util.cc
|-- regex_util.h
|-- selection_vector.cc
|-- selection_vector.h
|-- selection_vector_impl.h
|-- selection_vector_test.cc
|-- simple_arena.h
|-- simple_arena_test.cc
|-- symbols.map
|-- tests
|   |-- CMakeLists.txt
|   |-- binary_test.cc
|   |-- boolean_expr_test.cc
|   |-- date_time_test.cc
|   |-- decimal_single_test.cc
|   |-- decimal_test.cc
|   |-- external_functions
|   |   |-- CMakeLists.txt
|   |   |-- multiply_by_two.cc
|   |   `-- multiply_by_two.h
|   |-- filter_project_test.cc
|   |-- filter_test.cc
|   |-- generate_data.h
|   |-- hash_test.cc
|   |-- huge_table_test.cc
|   |-- if_expr_test.cc
|   |-- in_expr_test.cc
|   |-- literal_test.cc
|   |-- micro_benchmarks.cc
|   |-- null_validity_test.cc
|   |-- projector_build_validation_test.cc
|   |-- projector_test.cc
|   |-- test_util.cc
|   |-- test_util.h
|   |-- timed_evaluate.h
|   |-- to_string_test.cc
|   `-- utf8_test.cc
|-- to_date_holder.cc
|-- to_date_holder.h
|-- to_date_holder_test.cc
|-- tree_expr_builder.cc
|-- tree_expr_builder.h
|-- tree_expr_test.cc
|-- value_validity_pair.h
`-- visibility.h

Because there is so much code, only selected parts are analyzed below.

node

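Definitions for the nodes of the expression tree. The excerpt below is the NodeVisitor interface, which declares one Visit overload per node type.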

namespace gandiva {
class FieldNode;
class FunctionNode;
class IfNode;
class LiteralNode;
class BooleanNode;
template <typename Type>
class InExpressionNode;
/// \brief Visitor for nodes in the expression tree.
class GANDIVA_EXPORT NodeVisitor {
 public:
  virtual ~NodeVisitor() = default;
  virtual Status Visit(const FieldNode& node) = 0;
  virtual Status Visit(const FunctionNode& node) = 0;
  virtual Status Visit(const IfNode& node) = 0;
  virtual Status Visit(const LiteralNode& node) = 0;
  virtual Status Visit(const BooleanNode& node) = 0;
  virtual Status Visit(const InExpressionNode<int32_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<int64_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<float>& node) = 0;
  virtual Status Visit(const InExpressionNode<double>& node) = 0;
  virtual Status Visit(const InExpressionNode<gandiva::DecimalScalar128>& node) = 0;
  virtual Status Visit(const InExpressionNode<std::string>& node) = 0;
};
}  // namespace gandiva

tree_expr

tree_expr_test.cc
tree_expr_builder.cc
tree_expr_builder.h

These files build the expression tree for a computation such as 4*5+3. The tree is constructed through TreeExprBuilder.

TEST_F(TestExprTree, TestField) {
  Annotator annotator;
  auto n0 = TreeExprBuilder::MakeField(i0_);
  EXPECT_EQ(n0->return_type(), int32());
  auto n1 = TreeExprBuilder::MakeField(b0_);
  EXPECT_EQ(n1->return_type(), boolean());
  ExprDecomposer decomposer(*registry_, annotator);
  ValueValidityPairPtr pair;
  auto status = decomposer.Decompose(*n1, &pair);
  DCHECK_EQ(status.ok(), true) << status.message();
  auto value = pair->value_expr();
  auto value_dex = std::dynamic_pointer_cast<VectorReadFixedLenValueDex>(value);
  EXPECT_EQ(value_dex->FieldType(), boolean());
  EXPECT_EQ(pair->validity_exprs().size(), 1);
  auto validity = pair->validity_exprs().at(0);
  auto validity_dex = std::dynamic_pointer_cast<VectorReadValidityDex>(validity);
  EXPECT_NE(validity_dex->ValidityIdx(), value_dex->DataIdx());
}

Function overloading combined with the visitor pattern is used to traverse and transform the tree.

class GANDIVA_EXPORT TreeExprBuilder {
 public:
  /// \brief create a node on a literal.
  static NodePtr MakeLiteral(bool value);
  static NodePtr MakeLiteral(uint8_t value);
  static NodePtr MakeLiteral(uint16_t value);
  static NodePtr MakeLiteral(uint32_t value);
  static NodePtr MakeLiteral(uint64_t value);
  static NodePtr MakeLiteral(int8_t value);
  static NodePtr MakeLiteral(int16_t value);
  static NodePtr MakeLiteral(int32_t value);
  static NodePtr MakeLiteral(int64_t value);
  static NodePtr MakeLiteral(float value);
  static NodePtr MakeLiteral(double value);
  static NodePtr MakeStringLiteral(const std::string& value);
  static NodePtr MakeBinaryLiteral(const std::string& value);
  static NodePtr MakeDecimalLiteral(const DecimalScalar128& value);
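
To make the overload-plus-visitor idea concrete, here is a minimal standalone sketch that builds and evaluates 4*5+3. It is not Gandiva code: the names Expr, Lit, Add, Mul and EvalVisitor are made up for illustration, and only the shape (one overloaded Visit per node type, double dispatch through Accept) mirrors NodeVisitor.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical node types; none of these names come from Gandiva.
struct Lit;
struct Add;
struct Mul;

// One overloaded Visit per concrete node type, in the spirit of NodeVisitor.
struct Visitor {
  virtual ~Visitor() = default;
  virtual void Visit(const Lit& node) = 0;
  virtual void Visit(const Add& node) = 0;
  virtual void Visit(const Mul& node) = 0;
};

struct Expr {
  virtual ~Expr() = default;
  // Double dispatch: each node forwards itself to the matching overload.
  virtual void Accept(Visitor& v) const = 0;
};
using ExprPtr = std::shared_ptr<const Expr>;

struct Lit : Expr {
  explicit Lit(int64_t v) : value(v) {}
  void Accept(Visitor& v) const override { v.Visit(*this); }
  int64_t value;
};

struct Add : Expr {
  Add(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) {
    assert(lhs && rhs);
  }
  void Accept(Visitor& v) const override { v.Visit(*this); }
  ExprPtr lhs, rhs;
};

struct Mul : Expr {
  Mul(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) {
    assert(lhs && rhs);
  }
  void Accept(Visitor& v) const override { v.Visit(*this); }
  ExprPtr lhs, rhs;
};

// A visitor that evaluates the tree bottom-up.
class EvalVisitor : public Visitor {
 public:
  void Visit(const Lit& n) override { result_ = n.value; }
  void Visit(const Add& n) override { result_ = Eval(*n.lhs) + Eval(*n.rhs); }
  void Visit(const Mul& n) override { result_ = Eval(*n.lhs) * Eval(*n.rhs); }

  int64_t Eval(const Expr& e) {
    e.Accept(*this);
    return result_;
  }

 private:
  int64_t result_ = 0;
};

// Builds the tree for 4 * 5 + 3 and evaluates it.
inline int64_t EvalExample() {
  ExprPtr tree = std::make_shared<Add>(
      std::make_shared<Mul>(std::make_shared<Lit>(4), std::make_shared<Lit>(5)),
      std::make_shared<Lit>(3));
  EvalVisitor visitor;
  return visitor.Eval(*tree);
}
```

Adding a new traversal (printing, type checking, code generation) only needs another Visitor subclass; the node classes stay untouched, which is presumably why Gandiva structures its tree this way.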

to_date_holder

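Converts strings to dates. Judging by the expected values in the test, the result is milliseconds since the epoch with the time-of-day part dropped.

TEST_F(TestToDateHolder, TestSimpleDateTime) {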
  EXPECT_OK_AND_ASSIGN(auto to_date_holder, ToDateHolder::Make("YYYY-MM-DD HH:MI:SS", 1));
  auto& to_date = *to_date_holder;
  bool out_valid;
  std::string s("1986-12-01 01:01:01");
  int64_t millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01.11");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01 +0800");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
#if 0
  // TODO : this fails parsing with date::parse and strptime on linux
  s = std::string("1886-12-01 00:00:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int) s.length(), true, &out_valid);
  EXPECT_EQ(out_valid, true);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
#endif
  s = std::string("1886-12-01 01:01:01");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
  s = std::string("1986-12-11 01:30:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 534643200000);
}
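
The expected values above can be reproduced by hand. The sketch below is my own standalone helper, not ToDateHolder: DateToMillis and DaysFromCivil are hypothetical names, the day count uses Howard Hinnant's well-known civil-date algorithm, and only the leading YYYY-MM-DD is parsed, on the assumption (suggested by the test values) that the time of day is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Days since 1970-01-01 for a proleptic Gregorian date
// (Howard Hinnant's days_from_civil algorithm).
constexpr int64_t DaysFromCivil(int64_t y, unsigned m, unsigned d) {
  y -= m <= 2;
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
  const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
  const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
  return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

// Hypothetical helper: parses the leading "YYYY-MM-DD" and ignores the rest.
inline std::optional<int64_t> DateToMillis(const std::string& s) {
  int y = 0;
  unsigned m = 0, d = 0;
  if (std::sscanf(s.c_str(), "%d-%u-%u", &y, &m, &d) != 3) return std::nullopt;
  if (m < 1 || m > 12 || d < 1 || d > 31) return std::nullopt;
  return DaysFromCivil(y, m, d) * 86400000LL;  // milliseconds per day
}
```

1986-12-01 is day 6178 after the epoch, and 6178 * 86400000 = 533779200000, which matches the first three assertions in the test regardless of the time or time zone suffix.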

simple_arena

I did not fully understand this part. It appears to be a memory allocator that hands out memory in units of chunks.

TEST_F(TestSimpleArena, TestAlloc) {
  int64_t chunk_size = 4096;
  SimpleArena arena(arrow::default_memory_pool(), chunk_size);
  // Small allocations should come from the same chunk.
  int64_t small_size = 100;
  for (int64_t i = 0; i < 20; ++i) {
    auto p = arena.Allocate(small_size);
    EXPECT_NE(p, nullptr);
    EXPECT_EQ(arena.total_bytes(), chunk_size);
    EXPECT_EQ(arena.avail_bytes(), chunk_size - (i + 1) * small_size);
  }
  // large allocations require separate chunks
  int64_t large_size = 100 * chunk_size;
  auto p = arena.Allocate(large_size);
  EXPECT_NE(p, nullptr);
  EXPECT_EQ(arena.total_bytes(), chunk_size + large_size);
  EXPECT_EQ(arena.avail_bytes(), 0);
}
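
Reading the test, the behaviour looks like a classic bump allocator over chunks. Here is a minimal sketch of that idea; ChunkArena is a made-up name, it uses plain heap arrays instead of an arrow::MemoryPool, and it ignores alignment and reset, so treat it as an illustration of the accounting in the test rather than SimpleArena itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical chunk-based arena: memory is carved out of the current chunk,
// and a new chunk is allocated when the request does not fit.
class ChunkArena {
 public:
  explicit ChunkArena(int64_t min_chunk_size) : min_chunk_size_(min_chunk_size) {
    assert(min_chunk_size > 0);
  }

  uint8_t* Allocate(int64_t size) {
    assert(size >= 0);
    if (size > avail_bytes_) {
      // Oversized requests get a chunk of exactly their own size.
      const int64_t chunk_size = std::max(size, min_chunk_size_);
      chunks_.push_back(
          std::make_unique<uint8_t[]>(static_cast<std::size_t>(chunk_size)));
      total_bytes_ += chunk_size;
      avail_bytes_ = chunk_size;
      next_ = chunks_.back().get();
    }
    uint8_t* out = next_;  // bump the pointer inside the current chunk
    next_ += size;
    avail_bytes_ -= size;
    return out;
  }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t avail_bytes() const { return avail_bytes_; }

 private:
  int64_t min_chunk_size_;
  int64_t total_bytes_ = 0;
  int64_t avail_bytes_ = 0;
  uint8_t* next_ = nullptr;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
};
```

All memory is released together when the arena is destroyed, which suits per-batch scratch allocations where nothing needs to be freed individually.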

selection_vector

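Implements the selection vector for data stored in Arrow format.

Some background on selection vectors is needed here.

A selection vector is a technique used in data processing systems to record which rows of a batch are selected (valid), so that irrelevant rows are never operated on. It is common in columnar databases and vectorized execution engines (such as Apache Arrow, Dremio, and Gandiva) as a performance optimization.

A selection vector is essentially an index array that stores the positions of the selected rows within the original batch.

No data copying: only the index vector is manipulated, and the original data stays where it is.

Efficient filtering: rows that do not satisfy the condition are skipped quickly.

Fits vectorized execution: combined with batch processing, it helps SIMD performance.

In concrete terms, the selection may be held as a bitmap or as an explicit set of indices.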
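
A small standalone sketch of the idea, independent of Arrow: a filter result stored as a bitmap is turned into an index array, and a later operator then touches only the selected rows. The helper names (SelectionFromBitmap, SumSelected) are hypothetical, and bits are numbered least-significant-bit first within each byte, as in Arrow bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit helpers using LSB-first numbering within each byte.
inline bool GetBit(const std::vector<uint8_t>& bitmap, int64_t i) {
  return ((bitmap[static_cast<std::size_t>(i / 8)] >> (i % 8)) & 1) != 0;
}

inline void SetBit(std::vector<uint8_t>& bitmap, int64_t i) {
  bitmap[static_cast<std::size_t>(i / 8)] |= static_cast<uint8_t>(1 << (i % 8));
}

// Hypothetical helper: converts a filter bitmap into a list of selected rows.
inline std::vector<uint16_t> SelectionFromBitmap(const std::vector<uint8_t>& bitmap,
                                                 int64_t num_rows) {
  assert(num_rows <= static_cast<int64_t>(bitmap.size()) * 8);
  assert(num_rows <= 65536);  // indices must fit in uint16_t
  std::vector<uint16_t> selected;
  for (int64_t i = 0; i < num_rows; ++i) {
    if (GetBit(bitmap, i)) selected.push_back(static_cast<uint16_t>(i));
  }
  return selected;
}

// A downstream operator that reads only the selected rows of a column.
inline int64_t SumSelected(const std::vector<int32_t>& column,
                           const std::vector<uint16_t>& selection) {
  int64_t sum = 0;
  for (uint16_t row : selection) sum += column[row];
  return sum;
}
```

The column itself is never copied or compacted; only the small index array changes from operator to operator.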

TEST_F(TestSelectionVector, TestInt16Set) {
  int max_slots = 10;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt16(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  selection->SetIndex(0, 100);
  EXPECT_EQ(selection->GetIndex(0), 100);
  selection->SetIndex(1, 200);
  EXPECT_EQ(selection->GetIndex(1), 200);
  selection->SetNumSlots(2);
  EXPECT_EQ(selection->GetNumSlots(), 2);
  // TopArray() should return an array with 100,200
  auto array_raw = selection->ToArray();
  const auto& array = dynamic_cast<const arrow::UInt16Array&>(*array_raw);
  EXPECT_EQ(array.length(), 2) << array_raw->ToString();
  EXPECT_EQ(array.Value(0), 100) << array_raw->ToString();
  EXPECT_EQ(array.Value(1), 200) << array_raw->ToString();
}

A selection vector can also be populated from a bitmap.

TEST_F(TestSelectionVector, TestInt64PopulateFromBitMap) {
  int max_slots = 200;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt64(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  int bitmap_size = RoundUpNumi64(max_slots) * 8;
  std::vector<uint8_t> bitmap(bitmap_size);
  arrow::bit_util::SetBit(&bitmap[0], 0);
  arrow::bit_util::SetBit(&bitmap[0], 5);
  arrow::bit_util::SetBit(&bitmap[0], 121);
  arrow::bit_util::SetBit(&bitmap[0], 220);
  status = selection->PopulateFromBitMap(&bitmap[0], bitmap_size, max_slots - 1);
  EXPECT_EQ(status.ok(), true) << status.message();
  EXPECT_EQ(selection->GetNumSlots(), 3);
  EXPECT_EQ(selection->GetIndex(0), 0);
  EXPECT_EQ(selection->GetIndex(1), 5);
  EXPECT_EQ(selection->GetIndex(2), 121);
}

regex_functions/util

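Regular-expression support, which also seems to deal with SQL-specific pattern characters. This part uses Google's RE2 library and follows PCRE (Perl Compatible Regular Expressions) syntax as its reference.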

const std::set<char> RegexUtil::pcre_regex_specials_ = {
    '[', ']', '(', ')', '|', '^', '-', '+', '*', '?', '{', '}', '$', '\\', '.'};
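
My reading, which the excerpt alone does not confirm, is that this set exists so that regex metacharacters can be escaped when a SQL LIKE pattern is translated into a regular expression. A minimal sketch of such a translation follows; LikeToRegex and LikeMatch are hypothetical names, std::regex stands in for RE2, escape characters are not handled, and `_` matches one byte rather than one UTF-8 character.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>

// Hypothetical translation of a SQL LIKE pattern into a regex string.
inline std::string LikeToRegex(const std::string& like) {
  static const std::set<char> specials = {'[', ']', '(', ')', '|', '^', '-', '+',
                                          '*', '?', '{', '}', '$', '\\', '.'};
  std::string out;
  for (char c : like) {
    if (c == '%') {
      out += ".*";  // % matches any sequence of characters
    } else if (c == '_') {
      out += '.';   // _ matches exactly one character
    } else {
      if (specials.count(c) > 0) out += '\\';  // escape regex metacharacters
      out += c;
    }
  }
  return out;
}

// Whole-string match of text against a LIKE pattern.
inline bool LikeMatch(const std::string& text, const std::string& like) {
  return std::regex_match(text, std::regex(LikeToRegex(like)));
}
```

Without the escaping step, a pattern such as `%.cc` would treat the dot as "any character" and match far more than intended.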

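The tests mostly revolve around simple strings.

There is even a test with Chinese characters, which is a rare sight. UTF-8 handling in C++ has always puzzled me 😂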

  input_string = "路%c$大";
  extract_index = 2;  // Retrieve all matched string
  ret = extract_numbers(&execution_context_, input_string.c_str(),
                        static_cast<int32_t>(input_string.length()), extract_index,
                        &out_length);
  ret_as_str = std::string(ret, out_length);
  EXPECT_EQ(out_length, 1);
  EXPECT_EQ(ret_as_str, "c");

random_generator

A random number generator holder, including the seeding logic.

namespace gandiva {
/// Function Holder for 'random'
class GANDIVA_EXPORT RandomGeneratorHolder : public FunctionHolder {
 public:
  ~RandomGeneratorHolder() override = default;
  static Result<std::shared_ptr<RandomGeneratorHolder>> Make(const FunctionNode& node);
  double operator()() { return distribution_(generator_); }
 private:
  explicit RandomGeneratorHolder(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }
  RandomGeneratorHolder() : distribution_(0, 1) {
    generator_.seed(::arrow::internal::GetRandomSeed());
  }
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
}  // namespace gandiva
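
The seed handling can be tried out in isolation. The sketch below copies the scrambling expression from the constructor above into a free function and wraps it in a tiny class; ScrambleSeed and SeededRandom are names I made up. The XOR constant and 48-bit mask look the same as the initial scramble in java.util.Random, although the generator behind it here is std::mt19937_64.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Same expression as in RandomGeneratorHolder(int seed): XOR with a constant,
// then keep the low 48 bits.
inline uint64_t ScrambleSeed(int32_t seed) {
  int64_t seed64 = static_cast<int64_t>(seed);
  seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
  return static_cast<uint64_t>(seed64);
}

// Hypothetical wrapper: the same seed always yields the same sequence in [0, 1).
class SeededRandom {
 public:
  explicit SeededRandom(int32_t seed) : distribution_(0, 1) {
    generator_.seed(ScrambleSeed(seed));
  }
  double operator()() { return distribution_(generator_); }

 private:
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
```

A fixed seed makes a query using random(seed) reproducible, while the default constructor in the original class draws a fresh seed from Arrow instead.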

project

This is the code with which Gandiva handles Project over Apache Arrow data.

/// \brief projection using expressions.
///
/// A projector is built for a specific schema and vector of expressions.
/// Once the projector is built, it can be used to evaluate many row batches.

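The class holds the LLVM generator, the schema, the output fields, the configuration used for code generation, and a flag recording whether it was built from the cache.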

  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  FieldVector output_fields_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;
};

It also contains the code that allocates the output data buffers.

Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records,
                                 arrow::MemoryPool* pool,
                                 ArrayDataPtr* array_data) const {
  arrow::Status astatus;
  std::vector<std::shared_ptr<arrow::Buffer>> buffers;
  // The output vector always has a null bitmap.
  int64_t size = arrow::bit_util::BytesForBits(num_records);
  ARROW_ASSIGN_OR_RAISE(auto bitmap_buffer, arrow::AllocateBuffer(size, pool));
  buffers.push_back(std::move(bitmap_buffer));
  // String/Binary vectors have an offsets array.
  auto type_id = type->id();
  if (arrow::is_binary_like(type_id)) {
    auto offsets_len = arrow::bit_util::BytesForBits((num_records + 1) * 32);
    ARROW_ASSIGN_OR_RAISE(auto offsets_buffer, arrow::AllocateBuffer(offsets_len, pool));
    buffers.push_back(std::move(offsets_buffer));
  }
  // The output vector always has a data array.
  int64_t data_len;
  if (arrow::is_primitive(type_id) || type_id == arrow::Type::DECIMAL) {
    const auto& fw_type = static_cast<const arrow::FixedWidthType&>(*type);
    data_len = arrow::bit_util::BytesForBits(num_records * fw_type.bit_width());
  } else if (arrow::is_binary_like(type_id)) {
    // we don't know the expected size for varlen output vectors.
    data_len = 0;
  } else {
    return Status::Invalid("Unsupported output data type " + type->ToString());
  }
  ARROW_ASSIGN_OR_RAISE(auto data_buffer, arrow::AllocateResizableBuffer(data_len, pool));
  // This is not strictly required but valgrind gets confused and detects this
  // as uninitialized memory access. See arrow::util::SetBitTo().
  if (type->id() == arrow::Type::BOOL) {
    memset(data_buffer->mutable_data(), 0, data_len);
  }
  buffers.push_back(std::move(data_buffer));
  *array_data = arrow::ArrayData::Make(type, num_records, std::move(buffers));
  return Status::OK();
}
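
The buffer arithmetic in AllocArrayData is easy to check by hand. This standalone sketch recomputes the three sizes without Arrow; BufferSizes and SizesFor are hypothetical names, and BytesForBits here is a plain round-up-to-bytes reimplementation rather than the Arrow function.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes needed to hold the given number of bits (rounded up).
constexpr int64_t BytesForBits(int64_t bits) { return (bits + 7) / 8; }

struct BufferSizes {
  int64_t validity;  // null bitmap: one bit per record
  int64_t offsets;   // only for string/binary: (n + 1) 32-bit offsets
  int64_t data;      // fixed-width values, or 0 for variable-length output
};

// Hypothetical helper mirroring the size computations in AllocArrayData.
inline BufferSizes SizesFor(int64_t num_records, int bit_width, bool is_binary_like) {
  assert(num_records >= 0 && bit_width >= 0);
  BufferSizes s{};
  s.validity = BytesForBits(num_records);
  s.offsets = is_binary_like ? BytesForBits((num_records + 1) * 32) : 0;
  s.data = is_binary_like ? 0 : BytesForBits(num_records * bit_width);
  return s;
}
```

For 10 int32 records this gives a 2-byte bitmap and 40 bytes of data; for 10 strings it gives 44 bytes of offsets and an empty data buffer, which is why the data buffer is allocated as resizable.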

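Oddly, there is no projector_test.cc next to this file, although tests/projector_test.cc does appear in the listing above.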

lru_cache

An LRU cache adapted from Boost. Because the code is a template, you cannot tell from here what it actually stores.

// modified from boost LRU cache -> the boost cache supported only an
// ordered map.
namespace gandiva {
// a cache which evicts the least recently used item when it is full
template <class Key, class Value>
class LruCache {
 public:
  using key_type = Key;
  using value_type = Value;
  using list_type = std::list<key_type>;

The test code simply uses strings as the values.

TEST_F(TestLruCache, TestLruBehavior) {
  cache_.insert(TestCacheKey(1), "hello");
  cache_.insert(TestCacheKey(2), "hello");
  cache_.get(TestCacheKey(1));
  cache_.insert(TestCacheKey(3), "hello");
  // should have evicted key 2.
  ASSERT_EQ(*cache_.get(TestCacheKey(1)), "hello");
}
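
For reference, here is a minimal LRU cache in the same spirit: a list keeps keys in recency order and a hash map points into it. SimpleLru is my own sketch, not the Gandiva class; its interface (insert, get, contains) only loosely follows the test above.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache: most recently used keys sit at the front of the list.
template <class Key, class Value>
class SimpleLru {
  using ListIt = typename std::list<Key>::iterator;
  using Map = std::unordered_map<Key, std::pair<Value, ListIt>>;

 public:
  explicit SimpleLru(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  void insert(const Key& key, const Value& value) {
    auto it = map_.find(key);
    if (it != map_.end()) {  // existing key: update and mark as recent
      it->second.first = value;
      Touch(it);
      return;
    }
    if (map_.size() >= capacity_) {  // full: evict the least recently used key
      map_.erase(order_.back());
      order_.pop_back();
    }
    order_.push_front(key);
    map_.emplace(key, std::make_pair(value, order_.begin()));
  }

  std::optional<Value> get(const Key& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    Touch(it);  // a hit counts as a use
    return it->second.first;
  }

  bool contains(const Key& key) const { return map_.count(key) > 0; }

 private:
  // Moves the key to the front of the recency list; iterators stay valid.
  void Touch(typename Map::iterator it) {
    order_.splice(order_.begin(), order_, it->second.second);
  }

  std::size_t capacity_;
  std::list<Key> order_;
  Map map_;
};
```

With capacity 2, inserting keys 1 and 2, reading key 1 and then inserting key 3 evicts key 2, the same behaviour TestLruBehavior checks.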

llvm_types

llvm_types manages the creation of LLVM types in one place and maps Arrow types onto them. Similar code can be found in NoisePage.

class GANDIVA_EXPORT LLVMTypes {
 public:
  explicit LLVMTypes(llvm::LLVMContext& context);
  llvm::Type* void_type() { return llvm::Type::getVoidTy(context_); }
  llvm::Type* i1_type() { return llvm::Type::getInt1Ty(context_); }
  llvm::Type* i8_type() { return llvm::Type::getInt8Ty(context_); }
  llvm::Type* i16_type() { return llvm::Type::getInt16Ty(context_); }
  llvm::Type* i32_type() { return llvm::Type::getInt32Ty(context_); }
  llvm::Type* i64_type() { return llvm::Type::getInt64Ty(context_); }
  llvm::Type* i128_type() { return llvm::Type::getInt128Ty(context_); }
  llvm::StructType* i128_split_type() {
    // struct with high/low bits (see decimal_ops.cc:DecimalSplit)
    return llvm::StructType::get(context_, {i64_type(), i64_type()}, false);
  }

It also provides a few simple constant helpers.

  llvm::Constant* i128_zero() { return i128_constant(0); }
  llvm::Constant* i128_one() { return i128_constant(1); }

llvm_includes

The MSVC warning suppression at the top is worth noting. It is the first time I have come across it, and it shows that Gandiva can run on Windows.

#if defined(_MSC_VER)
#  pragma warning(push)
#  pragma warning(disable : 4141)
#  pragma warning(disable : 4146)
#  pragma warning(disable : 4244)
#  pragma warning(disable : 4267)
#  pragma warning(disable : 4291)
#  pragma warning(disable : 4624)
#endif

It even accounts for different LLVM versions.

#if LLVM_VERSION_MAJOR >= 10
#  define LLVM_ALIGN(alignment) (llvm::Align((alignment)))
#else
#  define LLVM_ALIGN(alignment) (alignment)
#endif

llvm_generator

The core of the project: LLVM code generation.

The generator appears to be able to make good use of a cache.

class GANDIVA_EXPORT LLVMGenerator {
 public:
  /// \brief Factory method to initialize the generator.
  static Result<std::unique_ptr<LLVMGenerator>> Make(
      const std::shared_ptr<Configuration>& config, bool cached,
      std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache =
          std::nullopt);
  /// \brief Get the cache to be used for LLVM ObjectCache.
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
  GetCache();

It stores the SelectionVector::Mode.

SelectionVector::Mode selection_vector_mode() { return selection_vector_mode_; }

Build takes the expressions and generates code for them.

  /// \brief Build the code for the expression trees for default mode with a LLVM
  /// ObjectCache. Each element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs, SelectionVector::Mode mode);
  /// \brief Build the code for the expression trees for default mode. Each
  /// element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs);

Execute runs the compiled functions against an Arrow record batch.

  /// \brief Execute the built expression against the provided arguments for
  /// default mode.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const ArrayDataVector& output_vector) const;
  /// \brief Execute the built expression against the provided arguments for
  /// all modes. Only works on the records specified in the selection_vector.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const SelectionVector* selection_vector,
                 const ArrayDataVector& output_vector) const;

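The basic LLVMContext and IRBuilder are naturally present. Creating a global string here does not check for duplicates, though. I am not sure whether that is an oversight or whether the check happens earlier 😂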

  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }

Then the visitor pattern is used to walk the decomposed expression tree once more.

 class Visitor : public DexVisitor {
   public:
    Visitor(LLVMGenerator* generator, llvm::Function* function,
            llvm::BasicBlock* entry_block, llvm::Value* arg_addrs,
            llvm::Value* arg_local_bitmaps, llvm::Value* arg_holder_ptrs,
            std::vector<llvm::Value*> slice_offsets, llvm::Value* arg_context_ptr,
            llvm::Value* loop_var);
    void Visit(const VectorReadValidityDex& dex) override;
    void Visit(const VectorReadFixedLenValueDex& dex) override;
    void Visit(const VectorReadVarLenValueDex& dex) override;
    void Visit(const LocalBitMapValidityDex& dex) override;
    void Visit(const TrueDex& dex) override;
    void Visit(const FalseDex& dex) override;
    void Visit(const LiteralDex& dex) override;
    void Visit(const NonNullableFuncDex& dex) override;
    void Visit(const NullableNeverFuncDex& dex) override;
    void Visit(const NullableInternalFuncDex& dex) override;
    void Visit(const IfDex& dex) override;
    void Visit(const BooleanAndDex& dex) override;
    void Visit(const BooleanOrDex& dex) override;
    void Visit(const InExprDexBase<int32_t>& dex) override;
    void Visit(const InExprDexBase<int64_t>& dex) override;
    void Visit(const InExprDexBase<float>& dex) override;
    void Visit(const InExprDexBase<double>& dex) override;
    void Visit(const InExprDexBase<gandiva::DecimalScalar128>& dex) override;
    void Visit(const InExprDexBase<std::string>& dex) override;
    template <typename Type>
    void VisitInExpression(const InExprDexBase<Type>& dex);
    LValuePtr result() { return result_; }
    bool has_arena_allocs() { return has_arena_allocs_; }

There are also dedicated functions for generating LLVM functions and function calls.

    std::vector<llvm::Value*> BuildParams(int holder_idx,
                                          const ValueValidityPairVector& args,
                                          bool with_validity, bool with_context);
    // Generate code to invoke a function call.
    LValuePtr BuildFunctionCall(const NativeFunction* func, DataTypePtr arrow_return_type,
                                std::vector<llvm::Value*>* params);
    // Generate code for an if-else condition.
    LValuePtr BuildIfElse(llvm::Value* condition, std::function<LValuePtr()> then_func,
                          std::function<LValuePtr()> else_func,
                          DataTypePtr arrow_return_type);

Calls to precompiled LLVM IR functions are added through this interface.

  /// Generate code to make a function call (to a pre-compiled IR function) which takes
  /// 'args' and has a return type 'ret_type'.
  llvm::Value* AddFunctionCall(const std::string& full_name, llvm::Type* ret_type,
                               const std::vector<llvm::Value*>& args);

The cache implementation in detail

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
LLVMGenerator::GetCache() {
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
      shared_cache = std::make_shared<
          Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>();
  return shared_cache;
}
Status LLVMGenerator::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  return engine_->SetLLVMObjectCache(object_cache);
}

Part of the Build implementation

Status LLVMGenerator::Build(const ExpressionVector& exprs, SelectionVector::Mode mode) {
  selection_vector_mode_ = mode;
  for (auto& expr : exprs) {
    auto output = annotator_.AddOutputFieldDescriptor(expr->result());
    ARROW_RETURN_NOT_OK(Add(expr, output));
  }
  // Compile and inject into the process' memory the generated function.
  ARROW_RETURN_NOT_OK(engine_->FinalizeModule());
  // setup the jit functions for each expression.
  for (auto& compiled_expr : compiled_exprs_) {
    auto fn_name = compiled_expr->GetFunctionName(mode);
    ARROW_ASSIGN_OR_RAISE(auto fn_ptr, engine_->CompiledFunction(fn_name));
    auto jit_fn = reinterpret_cast<EvalFunc>(fn_ptr);
    compiled_expr->SetJITFunction(selection_vector_mode_, jit_fn);
  }
  return Status::OK();
}

This part deserves a closer read when time allows. As for the tests, the example given here is a vector add that LLVM auto-vectorizes.

TEST_F(TestLLVMGenerator, TestAdd) {
  // Setup LLVM generator to do an arithmetic add of two vectors
  ASSERT_OK_AND_ASSIGN(auto generator,
                       LLVMGenerator::Make(TestConfigWithIrDumping(), false));
  Annotator annotator;
  auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
  auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
  auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);
  auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0);
  auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
  auto field1 = std::make_shared<arrow::Field>("f1", arrow::int32());
  auto desc1 = annotator.CheckAndAddInputFieldDescriptor(field1);
  auto validity_dex1 = std::make_shared<VectorReadValidityDex>(desc1);
  auto value_dex1 = std::make_shared<VectorReadFixedLenValueDex>(desc1);
  auto pair1 = std::make_shared<ValueValidityPair>(validity_dex1, value_dex1);
  DataTypeVector params{arrow::int32(), arrow::int32()};
  auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
  FunctionSignature signature(func_desc->name(), func_desc->params(),
                              func_desc->return_type());
  const NativeFunction* native_func =
      generator->function_registry_->LookupSignature(signature);
  std::vector<ValueValidityPairPtr> pairs{pair0, pair1};
  auto func_dex = std::make_shared<NonNullableFuncDex>(
      func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
  auto field_sum = std::make_shared<arrow::Field>("out", arrow::int32());
  auto desc_sum = annotator.CheckAndAddInputFieldDescriptor(field_sum);
  // LLVM 10 doesn't like the expr function name to be the same as the module name when
  // LLJIT is used
  std::string fn_name = "llvm_gen_test_add_expr";
  ASSERT_OK(generator->engine_->LoadFunctionIRs());
  ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                        SelectionVector::MODE_NONE));
  ASSERT_OK(generator->engine_->FinalizeModule());
  auto const& ir = generator->engine_->ir();
  EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
  ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
  ASSERT_TRUE(fn_ptr);
  auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
  constexpr size_t kNumRecords = 4;
  std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};
  std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};
  uint64_t in_bitmap = 0xffffffffffffffffull;
  std::array<uint32_t, kNumRecords> out{0, 0, 0, 0};
  uint64_t out_bitmap = 0;
  std::array<uint8_t*, 6> addrs{
      reinterpret_cast<uint8_t*>(a0.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(a1.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap),
  };
  std::array<int64_t, 6> addr_offsets{0, 0, 0, 0, 0, 0};
  eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
            0 /* dummy context ptr */, kNumRecords);
  EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));
  EXPECT_EQ(out_bitmap, 0ULL);
}

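GPT's explanation of this test is worth a look:

The TestAdd test case exercises the complete flow of the LLVM code generator producing a vector addition. In detail:

Test goal

Verify that Gandiva can correctly generate and execute LLVM code that adds two int32 vectors.

Test flow

1. Initialize the LLVM generator
ASSERT_OK_AND_ASSIGN(auto generator, LLVMGenerator::Make(TestConfigWithIrDumping(), false));
Creates an LLVM code generator with IR dumping enabled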

2. Set up the input vector descriptors
// First input vector f0 (int32)
auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);  // validity bitmap
auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0); // data values
auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
// Second input vector f1 (int32): set up the same way

3. Define the add function
DataTypeVector params{arrow::int32(), arrow::int32()};
auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
Defines an add function that takes two int32 arguments and returns int32

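4. Create the function expression
auto func_dex = std::make_shared<NonNullableFuncDex>(
    func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
Binds the input vector pairs to the add function

5. Generate the LLVM code
ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                      SelectionVector::MODE_NONE));
Generates LLVM IR for the expression

The second argument, 4, appears to be a buffer count (data and validity buffers for f0 and f1) rather than a record count; the number of records, kNumRecords, is passed later when the compiled function is called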

6. Compile and obtain the function pointer
ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
Compiles the LLVM IR to machine code

Obtains an executable function pointer

7. Prepare the test data
std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};  // first vector
std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};  // second vector
uint64_t in_bitmap = 0xffffffffffffffffull;        // all valid
std::array<uint32_t, kNumRecords> out{0, 0, 0, 0}; // output vector

8. Execute the generated function
eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
          0 /* dummy context ptr */, kNumRecords);
Calls the compiled function to perform the vector addition

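9. Verify the result
EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));  // 1+5, 2+6, 3+7, 4+8
EXPECT_EQ(out_bitmap, 0ULL);  // output bitmap stays 0; a set bit means valid, so presumably the value function never writes it

Key technical points

1. Vectorization check
EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
Checks that the generated LLVM IR contains a vectorized loop body

Confirms that LLVM applied its vectorization optimization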

2. Memory layout
std::array<uint8_t*, 6> addrs{
    reinterpret_cast<uint8_t*>(a0.data()),  // first vector data
    reinterpret_cast<uint8_t*>(&in_bitmap), // first vector validity bitmap
    reinterpret_cast<uint8_t*>(a1.data()),  // second vector data
    reinterpret_cast<uint8_t*>(&in_bitmap), // second vector validity bitmap
    reinterpret_cast<uint8_t*>(out.data()), // output vector data
    reinterpret_cast<uint8_t*>(&out_bitmap) // output vector validity bitmap
};

What the test shows

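This test verifies Gandiva's core functionality:

Expression representation: the operation is expressed in the internal Dex form (built by hand in this test rather than decomposed from a tree)

Code generation: efficient LLVM IR is generated

Compile and execute: the IR is compiled to machine code and run

Vectorization: the generated code is checked for a vectorized loop, so it can use SIMD instructions

Correctness: the computed result is verified

It exercises the code generation and execution pipeline end to end.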
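
To see what the compiled function has to compute, here is a hand-written stand-in that uses the same six-pointer layout as the test. It is only an illustration: AddInt32Values and IntersectValidity are made-up names, the real function is generated by LLVM, and the idea that validity is combined in a separate step (a bitwise AND of the input bitmaps) is my assumption based on the output bitmap staying 0 in the test.

```cpp
#include <cassert>
#include <cstdint>

// Layout taken from the test:
// [0] f0 data, [1] f0 validity, [2] f1 data, [3] f1 validity,
// [4] out data, [5] out validity.
inline void AddInt32Values(uint8_t* const* addrs, int64_t num_records) {
  assert(addrs != nullptr && num_records >= 0);
  const auto* in0 = reinterpret_cast<const int32_t*>(addrs[0]);
  const auto* in1 = reinterpret_cast<const int32_t*>(addrs[2]);
  auto* out = reinterpret_cast<int32_t*>(addrs[4]);
  // A simple counted loop like this is what an auto-vectorizer can turn
  // into SIMD adds; it only writes values, never the validity bitmap.
  for (int64_t i = 0; i < num_records; ++i) {
    out[i] = in0[i] + in1[i];
  }
}

// Assumed separate step: a result is valid only where both inputs are valid.
inline void IntersectValidity(const uint8_t* a, const uint8_t* b, uint8_t* out,
                              int64_t num_bytes) {
  for (int64_t i = 0; i < num_bytes; ++i) {
    out[i] = static_cast<uint8_t>(a[i] & b[i]);
  }
}
```

Keeping the value loop free of null checks is what leaves it simple enough for the "vector.body" block to appear in the IR.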

testing::HasSubstr here is a GMock matcher.

Here you can see that a C function can be registered directly.

TEST_F(TestLLVMGenerator, VerifyExtendedCFunctions) {
  VerifyFunctionMapping("multiply_by_three_int32", [](auto registry) {
    return TestConfigWithCFunction(std::move(registry));
  });
//test_util.cc
std::shared_ptr<Configuration> TestConfigWithCFunction(
    std::shared_ptr<FunctionRegistry> registry) {
  return BuildConfigurationWithRegistry(std::move(registry), [](auto reg) {
    return reg->Register(GetTestExternalCFunction(),
                         reinterpret_cast<void*>(multiply_by_three));
  });
}
static int64_t multiply_by_three(int32_t value) { return value * 3; }

literal_holder

A uniform way to represent and handle constant values of various types in Gandiva.

namespace gandiva {
using LiteralHolder =
    std::variant<bool, float, double, int8_t, int16_t, int32_t, int64_t, uint8_t,
                 uint16_t, uint32_t, uint64_t, std::string, DecimalScalar128>;
GANDIVA_EXPORT std::string ToString(const LiteralHolder& holder);
}  // namespace gandiva

std::variant, introduced in C++17, is a type-safe union: at run time it holds a value of exactly one of a fixed set of types, without the unsafety of a traditional union.

Rust's enum type is a more powerful version of std::variant.
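
A minimal sketch of how such a variant is consumed with std::visit. Literal and Describe are hypothetical names with a shorter type list than LiteralHolder; this is not Gandiva's ToString, just the general pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// A cut-down stand-in for LiteralHolder.
using Literal = std::variant<bool, int32_t, int64_t, double, std::string>;

// std::visit calls the lambda with whichever alternative is currently held;
// if constexpr picks the formatting per type at compile time.
inline std::string Describe(const Literal& lit) {
  return std::visit(
      [](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, bool>) {
          return v ? "true" : "false";
        } else if constexpr (std::is_same_v<T, std::string>) {
          return "'" + v + "'";
        } else {
          return std::to_string(v);
        }
      },
      lit);
}
```

Forgetting to handle an alternative is a compile error rather than a silent bug, which is the practical gain over a tagged union written by hand.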

interval_holder

Handles various kinds of time intervals.

  // Pass only years and days to cast
  data = "P12Y15D";
  response = cast_interval_day(&execution_context_, data.data(), 7, true, &out_valid);
  qty_days_in_response = 15;
  qty_millis_in_response = 0;
  EXPECT_TRUE(out_valid);
  EXPECT_FALSE(execution_context_.has_error());
  EXPECT_EQ(response, (qty_millis_in_response << 32) | qty_days_in_response);
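
The last assertion shows how the result is encoded: milliseconds in the upper 32 bits and days in the lower 32 bits of one int64. A small sketch of packing and unpacking in that layout follows; the function names are my own, and unsigned arithmetic is used for the shifts to stay clear of signed-shift pitfalls.

```cpp
#include <cassert>
#include <cstdint>

// Packs a day-time interval as (millis << 32) | days, as in the test above.
inline int64_t PackDayInterval(int32_t days, int32_t millis) {
  const uint64_t high = static_cast<uint64_t>(static_cast<uint32_t>(millis)) << 32;
  const uint64_t low = static_cast<uint32_t>(days);
  return static_cast<int64_t>(high | low);
}

// Lower 32 bits hold the day count.
inline int32_t DaysOf(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint64_t>(packed) & 0xffffffffULL);
}

// Upper 32 bits hold the milliseconds.
inline int32_t MillisOf(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint64_t>(packed) >> 32);
}
```

For the test input "P12Y15D" the expected value is therefore just 15: 15 days in the low half and 0 milliseconds in the high half (the years do not show up in this day-time result).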

hash_utils

The hash component is built on OpenSSL and mainly provides the SHA-family and MD5 functions

GANDIVA_EXPORT
const char* gdv_sha512_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha512_result_length = 128;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha512(),
                                sha512_result_length, out_length);
}
/// Hashes a generic message using the SHA256 algorithm
GANDIVA_EXPORT
const char* gdv_sha256_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha256_result_length = 64;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha256(),
                                sha256_result_length, out_length);
}
/// Hashes a generic message using the SHA1 algorithm
GANDIVA_EXPORT
const char* gdv_sha1_hash(int64_t context, const void* message, size_t message_length,
                          int32_t* out_length) {
  constexpr int sha1_result_length = 40;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha1(),
                                sha1_result_length, out_length);
}
GANDIVA_EXPORT
const char* gdv_md5_hash(int64_t context, const void* message, size_t message_length,
                         int32_t* out_length) {
  constexpr int md5_result_length = 32;
  return gdv_hash_using_openssl(context, message, message_length, EVP_md5(),
                                md5_result_length, out_length);
}
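
The *_result_length constants above are twice the digest size in bytes (SHA-512 is 64 bytes, SHA-256 32, SHA-1 20, MD5 16), which is what you would expect if the digest is returned as a hexadecimal string. The sketch below only illustrates that arithmetic; ToHex and HexLength are hypothetical helpers and do not call OpenSSL.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Number of hex characters needed for a digest of the given size in bits.
constexpr int HexLength(int digest_bits) { return digest_bits / 8 * 2; }

// Encode raw digest bytes as lowercase hex, two characters per byte.
inline std::string ToHex(const std::vector<uint8_t>& digest) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(digest.size() * 2);
  for (uint8_t byte : digest) {
    out += kDigits[byte >> 4];
    out += kDigits[byte & 0x0F];
  }
  return out;
}
```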

gandiva_object_cache

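Caches the compiled object code for an expression such as result1 = evaluate("column1 + column2 * 3");, so the same expression does not have to be compiled again. The class derives from llvm::ObjectCache and keeps the compiled code in an llvm::MemoryBuffer, keyed by ExpressionCacheKey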

class GandivaObjectCache : public llvm::ObjectCache {
 public:
  explicit GandivaObjectCache(
      std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>&
          cache,
      ExpressionCacheKey key);
  ~GandivaObjectCache() {}
  void notifyObjectCompiled(const llvm::Module* M, llvm::MemoryBufferRef Obj);
  std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module* M);
 private:
  ExpressionCacheKey cache_key_;
  std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> cache_;
};

function_signature

Computes a hash over a function signature; my guess is that it serves as a lookup or cache key

  EXPECT_EQ(FunctionSignature("extract_month", {arrow::date32()}, arrow::int64()),
            FunctionSignature("extract_month", {local_date32_type_}, local_i64_type_));
TEST_F(TestFunctionSignature, TestHash) {
  FunctionSignature f1("add", {arrow::int32(), arrow::int32()}, arrow::int64());
  FunctionSignature f2("add", {local_i32_type_, local_i32_type_}, local_i64_type_);
  EXPECT_EQ(f1.Hash(), f2.Hash());
  FunctionSignature f3("extractDay", {arrow::int64()}, arrow::int64());
  FunctionSignature f4("extractday", {arrow::int64()}, arrow::int64());
  EXPECT_EQ(f3.Hash(), f4.Hash());
}
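
Two things can be read off the test: signatures built from equal but distinct type objects hash the same, and the function name is compared without regard to case (extractDay vs extractday). Below is a minimal sketch that reproduces both properties. MiniSignature, HashCombine and HashSignature are hypothetical names, types are plain strings instead of arrow::DataType, and the boost-style combine step is my assumption, not necessarily what Gandiva does.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical signature with type names as strings.
struct MiniSignature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

inline std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// boost-style hash_combine: mixes the hash of value into seed.
inline void HashCombine(std::size_t& seed, const std::string& value) {
  seed ^= std::hash<std::string>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hash over the lowercased name, the parameter types in order, and the return type.
inline std::size_t HashSignature(const MiniSignature& sig) {
  std::size_t seed = 0;
  HashCombine(seed, ToLower(sig.name));
  for (const auto& type : sig.param_types) HashCombine(seed, type);
  HashCombine(seed, sig.return_type);
  return seed;
}
```

Hashing by value rather than by pointer is what makes the signature usable as a key in a map like the SignatureMap that appears in FunctionRegistry.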

function_registry

class GANDIVA_EXPORT FunctionRegistry {
 public:
  using iterator = const NativeFunction*;
  using FunctionHolderMaker =
      std::function<arrow::Result<std::shared_ptr<FunctionHolder>>(
          const FunctionNode& function_node)>;
  FunctionRegistry();
  FunctionRegistry(const FunctionRegistry&) = delete;
  FunctionRegistry& operator=(const FunctionRegistry&) = delete;
  /// Lookup a pre-compiled function by its signature.
  const NativeFunction* LookupSignature(const FunctionSignature& signature) const;
  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);
  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);
  /// \brief register a C function into the function registry
  /// @param func the registered function's metadata
  /// @param c_function_ptr the function pointer to the
  /// registered function's implementation
  /// @param function_holder_maker this will be used as the function holder if the
  /// function requires a function holder
  arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);
  /// \brief get a list of bitcode memory buffers saved in the registry
  const std::vector<std::shared_ptr<arrow::Buffer>>& GetBitcodeBuffers() const;
  /// \brief get a list of C functions saved in the registry
  const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;
  const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;
  iterator begin() const;
  iterator end() const;
  iterator back() const;
  friend arrow::Result<std::shared_ptr<FunctionRegistry>> MakeDefaultFunctionRegistry();
 private:
  std::vector<NativeFunction> pc_registry_;
  SignatureMap pc_registry_map_;
  std::vector<std::shared_ptr<arrow::Buffer>> bitcode_memory_buffers_;
  std::vector<std::pair<NativeFunction, void*>> c_functions_;
  FunctionHolderMakerRegistry holder_maker_registry_;
  Status Add(NativeFunction func);
};
/// \brief get the default function registry
GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();
}  // namespace gandiva

function_ir_builder

A very general-purpose IR builder (why didn't I think of this before?) that can even generate the basic-block branching for an if-else

class FunctionIRBuilder {
 public:
  explicit FunctionIRBuilder(Engine* engine) : engine_(engine) {}
  virtual ~FunctionIRBuilder() = default;
 protected:
  LLVMTypes* types() { return engine_->types(); }
  llvm::Module* module() { return engine_->module(); }
  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }
  /// Build an if-else block.
  llvm::Value* BuildIfElse(llvm::Value* condition, llvm::Type* return_type,
                           std::function<llvm::Value*()> then_func,
                           std::function<llvm::Value*()> else_func);
  struct NamedArg {
    std::string name;
    llvm::Type* type;
  };
  /// Build llvm fn.
  llvm::Function* BuildFunction(const std::string& function_name, llvm::Type* return_type,
                                std::vector<NamedArg> in_args);
 private:
  Engine* engine_;
};

filter

This part is also implemented on top of LLVM and looks much the same as the Projector

 private:
  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;

To add caching, simply call SetLLVMObjectCache

Status Engine::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  auto cached_buffer = object_cache.getObject(nullptr);
  if (cached_buffer) {
    auto error = lljit_->addObjectFile(std::move(cached_buffer));
    if (error) {
      return Status::CodeGenError("Failed to add cached object file to LLJIT: ",
                                  llvm::toString(std::move(error)));
    }
  }
  return Status::OK();
}

Optimization passes can be attached through the PassManager

static void OptimizeModuleWithNewPassManager(llvm::Module& module,
                                             llvm::TargetIRAnalysis target_analysis) {
  // Setup an optimiser pipeline
  llvm::PassBuilder pass_builder;
  llvm::LoopAnalysisManager loop_am;
  llvm::FunctionAnalysisManager function_am;
  llvm::CGSCCAnalysisManager cgscc_am;
  llvm::ModuleAnalysisManager module_am;
  function_am.registerPass([&] { return target_analysis; });
  // Register required analysis managers
  pass_builder.registerModuleAnalyses(module_am);
  pass_builder.registerCGSCCAnalyses(cgscc_am);
  pass_builder.registerFunctionAnalyses(function_am);
  pass_builder.registerLoopAnalyses(loop_am);
  pass_builder.crossRegisterProxies(loop_am, function_am, cgscc_am, module_am);
  pass_builder.registerPipelineStartEPCallback([&](llvm::ModulePassManager& module_pm,
                                                   llvm::OptimizationLevel Level) {
    module_pm.addPass(llvm::ModuleInlinerPass());
    llvm::FunctionPassManager function_pm;
    function_pm.addPass(llvm::InstCombinePass());
    function_pm.addPass(llvm::PromotePass());
    function_pm.addPass(llvm::GVNPass());
    function_pm.addPass(llvm::NewGVNPass());
    function_pm.addPass(llvm::SimplifyCFGPass());
    function_pm.addPass(llvm::LoopVectorizePass());
    function_pm.addPass(llvm::SLPVectorizerPass());
    module_pm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(function_pm)));
    module_pm.addPass(llvm::GlobalOptPass());
  });

engine

Almost all of the LLVM engine configuration lives in engine.h, engine.cc and engine_llvm_test.cc; the engine can also load pre-compiled LLVM IR

  /// load pre-compiled IR modules from precompiled_bitcode.cc and merge them into
  /// the main module.
  Status LoadPreCompiledIR();
  // load external pre-compiled bitcodes into module
  Status LoadExternalPreCompiledIR();
  // Create and add mappings for cpp functions that can be accessed from LLVM.
  arrow::Status AddGlobalMappings();
  // Remove unused functions to reduce compile time.
  Status RemoveUnusedFunctions();
  std::unique_ptr<llvm::LLVMContext> context_;
  std::unique_ptr<llvm::orc::LLJIT> lljit_;
  std::unique_ptr<llvm::IRBuilder<>> ir_builder_;
  std::unique_ptr<llvm::Module> module_;
  LLVMTypes types_;
  std::vector<std::string> functions_to_compile_;
  bool optimize_ = true;
  bool module_finalized_ = false;
  bool cached_;
  bool functions_loaded_ = false;
  std::shared_ptr<FunctionRegistry> function_registry_;
  std::string module_ir_;
  std::unique_ptr<llvm::TargetMachine> target_machine_;
  const std::shared_ptr<Configuration> conf_;
};

encrypt

Gandiva also ships encryption helpers (though I could not find any documentation on how to use them); the AES implementation again comes from OpenSSL

GANDIVA_EXPORT
int32_t aes_encrypt(const char* plaintext, int32_t plaintext_len, const char* key,
                    int32_t key_len, unsigned char* cipher);
/**
 * Decrypt data using aes algorithm
 **/
GANDIVA_EXPORT
int32_t aes_decrypt(const char* ciphertext, int32_t ciphertext_len, const char* key,
                    int32_t key_len, unsigned char* plaintext);

The corresponding test

TEST(TestShaEncryptUtils, TestAesEncryptDecrypt) {
  // 16 bytes key
  auto* key = "12345678abcdefgh";
  auto* to_encrypt = "some test string";
  auto key_len = static_cast<int32_t>(strlen(reinterpret_cast<const char*>(key)));
  auto to_encrypt_len =
      static_cast<int32_t>(strlen(reinterpret_cast<const char*>(to_encrypt)));
  unsigned char cipher_1[64];
  int32_t cipher_1_len =
      gandiva::aes_encrypt(to_encrypt, to_encrypt_len, key, key_len, cipher_1);
  unsigned char decrypted_1[64];
  int32_t decrypted_1_len = gandiva::aes_decrypt(reinterpret_cast<const char*>(cipher_1),
                                                 cipher_1_len, key, key_len, decrypted_1);
  EXPECT_EQ(std::string(reinterpret_cast<const char*>(to_encrypt), to_encrypt_len),
            std::string(reinterpret_cast<const char*>(decrypted_1), decrypted_1_len));

decimal_ir

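Code generation for decimal values (an i128 value together with its precision and scale, as the comments below put it) gets special treatment; there seem to be quite a few pitfalls here 😂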

class DecimalIR : public FunctionIRBuilder {
 public:
  explicit DecimalIR(Engine* engine)
      : FunctionIRBuilder(engine), enable_ir_traces_(false) {}
  /// Build decimal IR functions and add them to the engine.
  static Status AddFunctions(Engine* engine);
  void EnableTraces() { enable_ir_traces_ = true; }
  llvm::Value* CallDecimalFunction(const std::string& function_name,
                                   llvm::Type* return_type,
                                   const std::vector<llvm::Value*>& args);
 private:
  /// The intrinsic fn for divide with small divisors is about 10x slower, so not
  /// using these.
  static const bool kUseOverflowIntrinsics = false;
  // Holder for an i128 value, along with its with scale and precision.
  class ValueFull {
   public:
    ValueFull(llvm::Value* value, llvm::Value* precision, llvm::Value* scale)
        : value_(value), precision_(precision), scale_(scale) {}
    llvm::Value* value() const { return value_; }
    llvm::Value* precision() const { return precision_; }
    llvm::Value* scale() const { return scale_; }
   private:
    llvm::Value* value_;
    llvm::Value* precision_;
    llvm::Value* scale_;
  };
  // Holder for an i128 value, and a boolean indicating overflow.
  class ValueWithOverflow {
   public:
    ValueWithOverflow(llvm::Value* value, llvm::Value* overflow)
        : value_(value), overflow_(overflow) {}
    // Make from IR struct
    static ValueWithOverflow MakeFromStruct(DecimalIR* decimal_ir, llvm::Value* dstruct);
    // Build a corresponding IR struct
    llvm::Value* AsStruct(DecimalIR* decimal_ir) const;
    llvm::Value* value() const { return value_; }
    llvm::Value* overflow() const { return overflow_; }
   private:
    llvm::Value* value_;
    llvm::Value* overflow_;
  };

Appendix: mapping from Arrow types to C function types

Gandiva type (Arrow data type)	C function type
int8	int8_t
int16	int16_t
int32	int32_t
int64	int64_t
uint8	uint8_t
uint16	uint16_t
uint32	uint32_t
uint64	uint64_t
float32	float
float64	double
boolean	bool
date32	int32_t
date64	int64_t
timestamp	int64_t
time32	int32_t
time64	int64_t
interval_month	int32_t
interval_day_time	int64_t
utf8 (as parameter type)	const char*, uint32_t
utf8 (as return type)	int64_t context, const char*, uint32_t*
binary (as parameter type)	const char*, uint32_t
binary (as return type)	int64_t context, const char*, uint32_t*

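Summary

Setting aside the puzzling decision not to split the project files into folders, the code quality is high; the comments at the key points and the test cases make it possible to understand what Gandiva is doing, which is quite interesting.

There is a manual for Gandiva external functions, but to see how things are actually used you still have to read the test cases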

node

The node types of the expression tree and their visitor interface

namespace gandiva {
class FieldNode;
class FunctionNode;
class IfNode;
class LiteralNode;
class BooleanNode;
template <typename Type>
class InExpressionNode;
/// \brief Visitor for nodes in the expression tree.
class GANDIVA_EXPORT NodeVisitor {
 public:
  virtual ~NodeVisitor() = default;
  virtual Status Visit(const FieldNode& node) = 0;
  virtual Status Visit(const FunctionNode& node) = 0;
  virtual Status Visit(const IfNode& node) = 0;
  virtual Status Visit(const LiteralNode& node) = 0;
  virtual Status Visit(const BooleanNode& node) = 0;
  virtual Status Visit(const InExpressionNode<int32_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<int64_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<float>& node) = 0;
  virtual Status Visit(const InExpressionNode<double>& node) = 0;
  virtual Status Visit(const InExpressionNode<gandiva::DecimalScalar128>& node) = 0;
  virtual Status Visit(const InExpressionNode<std::string>& node) = 0;
};
}  // namespace gandiva

tree_expr

tree_expr_test.cc
tree_expr_builder.cc
tree_expr_builder.h

Used to build expression trees for computations such as 4*5+3; the tree is constructed through TreeExprBuilder

TEST_F(TestExprTree, TestField) {
  Annotator annotator;
  auto n0 = TreeExprBuilder::MakeField(i0_);
  EXPECT_EQ(n0->return_type(), int32());
  auto n1 = TreeExprBuilder::MakeField(b0_);
  EXPECT_EQ(n1->return_type(), boolean());
  ExprDecomposer decomposer(*registry_, annotator);
  ValueValidityPairPtr pair;
  auto status = decomposer.Decompose(*n1, &pair);
  DCHECK_EQ(status.ok(), true) << status.message();
  auto value = pair->value_expr();
  auto value_dex = std::dynamic_pointer_cast<VectorReadFixedLenValueDex>(value);
  EXPECT_EQ(value_dex->FieldType(), boolean());
  EXPECT_EQ(pair->validity_exprs().size(), 1);
  auto validity = pair->validity_exprs().at(0);
  auto validity_dex = std::dynamic_pointer_cast<VectorReadValidityDex>(validity);
  EXPECT_NE(validity_dex->ValidityIdx(), value_dex->DataIdx());
}

With function overloading and the visitor pattern, the tree can be traversed and transformed

class GANDIVA_EXPORT TreeExprBuilder {
 public:
  /// \brief create a node on a literal.
  static NodePtr MakeLiteral(bool value);
  static NodePtr MakeLiteral(uint8_t value);
  static NodePtr MakeLiteral(uint16_t value);
  static NodePtr MakeLiteral(uint32_t value);
  static NodePtr MakeLiteral(uint64_t value);
  static NodePtr MakeLiteral(int8_t value);
  static NodePtr MakeLiteral(int16_t value);
  static NodePtr MakeLiteral(int32_t value);
  static NodePtr MakeLiteral(int64_t value);
  static NodePtr MakeLiteral(float value);
  static NodePtr MakeLiteral(double value);
  static NodePtr MakeStringLiteral(const std::string& value);
  static NodePtr MakeBinaryLiteral(const std::string& value);
  static NodePtr MakeDecimalLiteral(const DecimalScalar128& value);

to_date_holder

Converts strings into timestamps

TEST_F(TestToDateHolder, TestSimpleDateTime) {
  EXPECT_OK_AND_ASSIGN(auto to_date_holder, ToDateHolder::Make("YYYY-MM-DD HH:MI:SS", 1));
  auto& to_date = *to_date_holder;
  bool out_valid;
  std::string s("1986-12-01 01:01:01");
  int64_t millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01.11");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01 +0800");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
#if 0
  // TODO : this fails parsing with date::parse and strptime on linux
  s = std::string("1886-12-01 00:00:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int) s.length(), true, &out_valid);
  EXPECT_EQ(out_valid, true);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
#endif
  s = std::string("1886-12-01 01:01:01");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
  s = std::string("1986-12-11 01:30:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 534643200000);
}

simple_arena

I did not fully understand this part; it appears to deal with memory allocation, handing out memory in units of chunks

TEST_F(TestSimpleArena, TestAlloc) {
  int64_t chunk_size = 4096;
  SimpleArena arena(arrow::default_memory_pool(), chunk_size);
  // Small allocations should come from the same chunk.
  int64_t small_size = 100;
  for (int64_t i = 0; i < 20; ++i) {
    auto p = arena.Allocate(small_size);
    EXPECT_NE(p, nullptr);
    EXPECT_EQ(arena.total_bytes(), chunk_size);
    EXPECT_EQ(arena.avail_bytes(), chunk_size - (i + 1) * small_size);
  }
  // large allocations require separate chunks
  int64_t large_size = 100 * chunk_size;
  auto p = arena.Allocate(large_size);
  EXPECT_NE(p, nullptr);
  EXPECT_EQ(arena.total_bytes(), chunk_size + large_size);
  EXPECT_EQ(arena.avail_bytes(), 0);
}
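
The behaviour the test describes can be reproduced with a very small bump allocator. This is a sketch under my own assumptions: MiniArena is a hypothetical class that allocates from the global heap rather than an arrow::MemoryPool, ignores alignment, and simply abandons the tail of the current chunk when a request does not fit. It is not the SimpleArena implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical chunk-based bump allocator.
class MiniArena {
 public:
  explicit MiniArena(int64_t min_chunk_size) : min_chunk_size_(min_chunk_size) {
    assert(min_chunk_size > 0);
  }

  // Hands out size bytes; memory is released only when the arena is destroyed.
  uint8_t* Allocate(int64_t size) {
    assert(size >= 0);
    if (size > avail_bytes_) {
      // Start a new chunk, large enough for this request.
      const int64_t chunk_size = std::max(size, min_chunk_size_);
      chunks_.push_back(
          std::make_unique<uint8_t[]>(static_cast<std::size_t>(chunk_size)));
      total_bytes_ += chunk_size;
      avail_bytes_ = chunk_size;
      next_ = chunks_.back().get();
    }
    uint8_t* result = next_;
    next_ += size;
    avail_bytes_ -= size;
    return result;
  }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t avail_bytes() const { return avail_bytes_; }

 private:
  int64_t min_chunk_size_;
  int64_t total_bytes_ = 0;
  int64_t avail_bytes_ = 0;
  uint8_t* next_ = nullptr;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
};
```

The appeal of this design for per-batch scratch memory is that an allocation is just a pointer increment and everything is freed together.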

selection_vector

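Implements the selection vector for data stored in Arrow format

Some background on selection vectors is needed here

A selection vector is a technique used in data processing systems to mark which rows of a batch are selected (valid), so that no work is spent on irrelevant rows. It is common in columnar databases and vectorized execution engines (for example Apache Arrow, Dremio, Gandiva) as a performance optimization.

A selection vector is essentially an index array that stores the positions of the selected rows within the original batch. Its benefits:

Avoids copying data: only the index vector is manipulated, the original data stays where it is.

Efficient filtering: rows that do not satisfy the condition can be skipped quickly.

Supports vectorized execution: combined with batch processing, it helps SIMD performance.

In a concrete implementation, the selection may be represented as a bitmap or as a set of indices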

TEST_F(TestSelectionVector, TestInt16Set) {
  int max_slots = 10;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt16(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  selection->SetIndex(0, 100);
  EXPECT_EQ(selection->GetIndex(0), 100);
  selection->SetIndex(1, 200);
  EXPECT_EQ(selection->GetIndex(1), 200);
  selection->SetNumSlots(2);
  EXPECT_EQ(selection->GetNumSlots(), 2);
  // TopArray() should return an array with 100,200
  auto array_raw = selection->ToArray();
  const auto& array = dynamic_cast<const arrow::UInt16Array&>(*array_raw);
  EXPECT_EQ(array.length(), 2) << array_raw->ToString();
  EXPECT_EQ(array.Value(0), 100) << array_raw->ToString();
  EXPECT_EQ(array.Value(1), 200) << array_raw->ToString();
}

A selection vector can also be populated from a bitmap

TEST_F(TestSelectionVector, TestInt64PopulateFromBitMap) {
  int max_slots = 200;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt64(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  int bitmap_size = RoundUpNumi64(max_slots) * 8;
  std::vector<uint8_t> bitmap(bitmap_size);
  arrow::bit_util::SetBit(&bitmap[0], 0);
  arrow::bit_util::SetBit(&bitmap[0], 5);
  arrow::bit_util::SetBit(&bitmap[0], 121);
  arrow::bit_util::SetBit(&bitmap[0], 220);
  status = selection->PopulateFromBitMap(&bitmap[0], bitmap_size, max_slots - 1);
  EXPECT_EQ(status.ok(), true) << status.message();
  EXPECT_EQ(selection->GetNumSlots(), 3);
  EXPECT_EQ(selection->GetIndex(0), 0);
  EXPECT_EQ(selection->GetIndex(1), 5);
  EXPECT_EQ(selection->GetIndex(2), 121);
}
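
The conversion from a bitmap to a list of indices can be sketched in a few lines. IndicesFromBitmap and SetBit are hypothetical helpers; the bit order (least significant bit first within each byte) follows Arrow's bitmap convention, and the plain bit-by-bit loop is my simplification, since the real implementation may well scan a whole word at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Set the bit for the given row (LSB-first within each byte).
inline void SetBit(std::vector<uint8_t>& bitmap, int64_t row) {
  bitmap[static_cast<std::size_t>(row / 8)] |= static_cast<uint8_t>(1u << (row % 8));
}

// Collect the positions of all set bits among the first num_rows bits.
inline std::vector<int64_t> IndicesFromBitmap(const std::vector<uint8_t>& bitmap,
                                              int64_t num_rows) {
  std::vector<int64_t> indices;
  for (int64_t row = 0; row < num_rows; ++row) {
    const auto byte = static_cast<std::size_t>(row / 8);
    if (byte >= bitmap.size()) break;  // bitmap shorter than num_rows
    if ((bitmap[byte] >> (row % 8)) & 1) indices.push_back(row);
  }
  return indices;
}
```

As in the test above, a bit set beyond the row limit (220 with a limit of 199) is simply never looked at.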

regex_functions/util

Regular-expression support. It appears to deal with SQL pattern symbols; this part uses Google's RE2 library and follows PCRE (Perl Compatible Regular Expressions) syntax

const std::set<char> RegexUtil::pcre_regex_specials_ = {
    '[', ']', '(', ')', '|', '^', '-', '+', '*', '?', '{', '}', '$', '\\', '.'};
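
A set of regex metacharacters like this is what one needs when turning a SQL LIKE pattern into a regular expression: the wildcards are translated and every other special character is escaped so it matches literally. The sketch below assumes that use; LikeToRegex is a hypothetical function, it ignores LIKE escape characters, and it is not Gandiva's RegexUtil code.

```cpp
#include <set>
#include <string>

// Translate a SQL LIKE pattern into a regex: % becomes .*, _ becomes .,
// and regex metacharacters are escaped with a backslash.
inline std::string LikeToRegex(const std::string& like) {
  static const std::set<char> kSpecials = {'[', ']', '(', ')', '|', '^', '-', '+',
                                           '*', '?', '{', '}', '$', '\\', '.'};
  std::string out;
  for (char c : like) {
    if (c == '%') {
      out += ".*";
    } else if (c == '_') {
      out += '.';
    } else {
      if (kSpecials.count(c) > 0) out += '\\';
      out += c;
    }
  }
  return out;
}
```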

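The tests mostly revolve around simple strings

You can even find a test with Chinese characters, which is a rare sight; UTF-8 handling in C++ has always left me baffled 😂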

  input_string = "路%c$大";
  extract_index = 2;  // Retrieve all matched string
  ret = extract_numbers(&execution_context_, input_string.c_str(),
                        static_cast<int32_t>(input_string.length()), extract_index,
                        &out_length);
  ret_as_str = std::string(ret, out_length);
  EXPECT_EQ(out_length, 1);
  EXPECT_EQ(ret_as_str, "c");

random_generator

A random number generator, including the handling of the random seed

namespace gandiva {
/// Function Holder for 'random'
class GANDIVA_EXPORT RandomGeneratorHolder : public FunctionHolder {
 public:
  ~RandomGeneratorHolder() override = default;
  static Result<std::shared_ptr<RandomGeneratorHolder>> Make(const FunctionNode& node);
  double operator()() { return distribution_(generator_); }
 private:
  explicit RandomGeneratorHolder(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }
  RandomGeneratorHolder() : distribution_(0, 1) {
    generator_.seed(::arrow::internal::GetRandomSeed());
  }
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
}  // namespace gandiva
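
The seed handling in the constructor (XOR with 0x5DEECE66D, then masking to 48 bits) is the same scrambling step java.util.Random applies to its seed. What matters for users is that an explicit seed gives a reproducible sequence. A minimal sketch of just that property, where MiniRandom is a hypothetical stand-alone class that copies the seeding logic from the excerpt above:

```cpp
#include <cstdint>
#include <random>

// Hypothetical stand-alone generator with the same seed scrambling as above.
class MiniRandom {
 public:
  explicit MiniRandom(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }

  // Next value, uniformly distributed over the unit interval.
  double Next() { return distribution_(generator_); }

 private:
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
```

Two instances constructed with the same seed produce identical sequences, which is what makes a seeded random(seed) expression testable.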

project

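This is the code for how Gandiva handles projection (Project) over Apache Arrow data

/// \brief projection using expressions.
///
/// A projector is built for a specific schema and vector of expressions.
/// Once the projector is built, it can be used to evaluate many row batches.

The implementation holds the LLVM generator, the input schema, the output fields, the configuration used for code generation, and a flag recording whether it was built from the cache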

  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  FieldVector output_fields_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;
};

It also contains the code that allocates the output data buffers

Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records,
                                 arrow::MemoryPool* pool,
                                 ArrayDataPtr* array_data) const {
  arrow::Status astatus;
  std::vector<std::shared_ptr<arrow::Buffer>> buffers;
  // The output vector always has a null bitmap.
  int64_t size = arrow::bit_util::BytesForBits(num_records);
  ARROW_ASSIGN_OR_RAISE(auto bitmap_buffer, arrow::AllocateBuffer(size, pool));
  buffers.push_back(std::move(bitmap_buffer));
  // String/Binary vectors have an offsets array.
  auto type_id = type->id();
  if (arrow::is_binary_like(type_id)) {
    auto offsets_len = arrow::bit_util::BytesForBits((num_records + 1) * 32);
    ARROW_ASSIGN_OR_RAISE(auto offsets_buffer, arrow::AllocateBuffer(offsets_len, pool));
    buffers.push_back(std::move(offsets_buffer));
  }
  // The output vector always has a data array.
  int64_t data_len;
  if (arrow::is_primitive(type_id) || type_id == arrow::Type::DECIMAL) {
    const auto& fw_type = static_cast<const arrow::FixedWidthType&>(*type);
    data_len = arrow::bit_util::BytesForBits(num_records * fw_type.bit_width());
  } else if (arrow::is_binary_like(type_id)) {
    // we don't know the expected size for varlen output vectors.
    data_len = 0;
  } else {
    return Status::Invalid("Unsupported output data type " + type->ToString());
  }
  ARROW_ASSIGN_OR_RAISE(auto data_buffer, arrow::AllocateResizableBuffer(data_len, pool));
  // This is not strictly required but valgrind gets confused and detects this
  // as uninitialized memory access. See arrow::util::SetBitTo().
  if (type->id() == arrow::Type::BOOL) {
    memset(data_buffer->mutable_data(), 0, data_len);
  }
  buffers.push_back(std::move(data_buffer));
  *array_data = arrow::ArrayData::Make(type, num_records, std::move(buffers));
  return Status::OK();
}
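
The sizing rules in AllocArrayData come down to a bit of arithmetic: one validity bit per record, a 32-bit offset per record plus one for variable-length types, and bit_width bits per record for fixed-width data. A sketch of just that arithmetic, with hypothetical names (BufferSizes, FixedWidthSizes, VarLenSizes); BytesForBits here is my own rounding-up helper, written to behave like the Arrow utility of the same name.

```cpp
#include <cstdint>

// Round a bit count up to whole bytes.
inline int64_t BytesForBits(int64_t bits) { return (bits + 7) / 8; }

struct BufferSizes {
  int64_t validity;  // null bitmap
  int64_t offsets;   // 0 for fixed-width types
  int64_t data;      // 0 when the size is not known up front
};

// Fixed-width column, e.g. int32 (bit_width 32) or boolean (bit_width 1).
inline BufferSizes FixedWidthSizes(int64_t num_records, int64_t bit_width) {
  return {BytesForBits(num_records), 0, BytesForBits(num_records * bit_width)};
}

// String/binary column with 32-bit offsets; the data buffer starts empty.
inline BufferSizes VarLenSizes(int64_t num_records) {
  return {BytesForBits(num_records), BytesForBits((num_records + 1) * 32), 0};
}
```

For 10 int32 records this gives a 2-byte bitmap and 40 bytes of data; for 10 strings, a 2-byte bitmap and 44 bytes of offsets, with the data buffer resized later.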

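Oddly, this part has no test file of its own next to the source; the projector tests appear to live in tests/projector_test.cc instead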

lru_cache

An LRU cache adapted from the Boost one; because the code is a template, you cannot tell from here what is actually stored in it

// modified from boost LRU cache -> the boost cache supported only an
// ordered map.
namespace gandiva {
// a cache which evicts the least recently used item when it is full
template <class Key, class Value>
class LruCache {
 public:
  using key_type = Key;
  using value_type = Value;
  using list_type = std::list<key_type>;

The test code simply stores strings

TEST_F(TestLruCache, TestLruBehavior) {
  cache_.insert(TestCacheKey(1), "hello");
  cache_.insert(TestCacheKey(2), "hello");
  cache_.get(TestCacheKey(1));
  cache_.insert(TestCacheKey(3), "hello");
  // should have evicted key 2.
  ASSERT_EQ(*cache_.get(TestCacheKey(1)), "hello");
}

llvm_types

llvm_types manages the creation of LLVM types in one place and is used to map Arrow's types onto them; similar code can be found in NoisePage

class GANDIVA_EXPORT LLVMTypes {
 public:
  explicit LLVMTypes(llvm::LLVMContext& context);
  llvm::Type* void_type() { return llvm::Type::getVoidTy(context_); }
  llvm::Type* i1_type() { return llvm::Type::getInt1Ty(context_); }
  llvm::Type* i8_type() { return llvm::Type::getInt8Ty(context_); }
  llvm::Type* i16_type() { return llvm::Type::getInt16Ty(context_); }
  llvm::Type* i32_type() { return llvm::Type::getInt32Ty(context_); }
  llvm::Type* i64_type() { return llvm::Type::getInt64Ty(context_); }
  llvm::Type* i128_type() { return llvm::Type::getInt128Ty(context_); }
  llvm::StructType* i128_split_type() {
    // struct with high/low bits (see decimal_ops.cc:DecimalSplit)
    return llvm::StructType::get(context_, {i64_type(), i64_type()}, false);
  }

Plus some helpers for simple constants

  llvm::Constant* i128_zero() { return i128_constant(0); }
  llvm::Constant* i128_one() { return i128_constant(1); }

llvm_includes

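The block at the top that switches off MSVC warnings is worth noting, as this is the first time I have come across it; it also shows that Gandiva can run on Windows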

#if defined(_MSC_VER)
#  pragma warning(push)
#  pragma warning(disable : 4141)
#  pragma warning(disable : 4146)
#  pragma warning(disable : 4244)
#  pragma warning(disable : 4267)
#  pragma warning(disable : 4291)
#  pragma warning(disable : 4624)
#endif

It even accounts for different LLVM versions

#if LLVM_VERSION_MAJOR >= 10
#  define LLVM_ALIGN(alignment) (llvm::Align((alignment)))
#else
#  define LLVM_ALIGN(alignment) (alignment)
#endif

llvm_generator

The heart of it all: LLVM code generation

The generator seems able to make good use of the cache

class GANDIVA_EXPORT LLVMGenerator {
 public:
  /// \brief Factory method to initialize the generator.
  static Result<std::unique_ptr<LLVMGenerator>> Make(
      const std::shared_ptr<Configuration>& config, bool cached,
      std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache =
          std::nullopt);
  /// \brief Get the cache to be used for LLVM ObjectCache.
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
  GetCache();

It stores the SelectionVector::Mode:

SelectionVector::Mode selection_vector_mode() { return selection_vector_mode_; }

Build takes the expression trees and generates code for them:

  /// \brief Build the code for the expression trees for default mode with a LLVM
  /// ObjectCache. Each element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs, SelectionVector::Mode mode);
  /// \brief Build the code for the expression trees for default mode. Each
  /// element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs);

Execute feeds Arrow data into the compiled functions:

  /// \brief Execute the built expression against the provided arguments for
  /// default mode.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const ArrayDataVector& output_vector) const;
  /// \brief Execute the built expression against the provided arguments for
  /// all modes. Only works on the records specified in the selection_vector.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const SelectionVector* selection_vector,
                 const ArrayDataVector& output_vector) const;
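
As a rough mental model of the second Execute overload, the sketch below evaluates an addition only for the rows named in a selection vector. AddSelected is a hypothetical helper, not Gandiva API, and writing the results densely (one output slot per selected row) is my assumption about the behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Evaluate a[row] + b[row] only for the rows listed in `selection`.
// Assumption: results are written densely, one per selected row.
inline std::vector<int32_t> AddSelected(const std::vector<int32_t>& a,
                                        const std::vector<int32_t>& b,
                                        const std::vector<uint16_t>& selection) {
  std::vector<int32_t> out;
  out.reserve(selection.size());
  for (uint16_t row : selection) {
    assert(row < a.size() && row < b.size());  // selected rows must exist
    out.push_back(a[row] + b[row]);
  }
  return out;
}
```

With inputs {1, 2, 3, 4} and {5, 6, 7, 8} and a selection of rows 1 and 3, only two sums are computed.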

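The basic LLVMContext and IRBuilder are naturally present. Curiously, creating a global string here does not check for duplicates; I am not sure whether that is an oversight or whether the check happens earlier 😂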

  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }

Then the Visitor pattern is used to walk the decomposed expression tree once more:

  class Visitor : public DexVisitor {
   public:
    Visitor(LLVMGenerator* generator, llvm::Function* function,
            llvm::BasicBlock* entry_block, llvm::Value* arg_addrs,
            llvm::Value* arg_local_bitmaps, llvm::Value* arg_holder_ptrs,
            std::vector<llvm::Value*> slice_offsets, llvm::Value* arg_context_ptr,
            llvm::Value* loop_var);
    void Visit(const VectorReadValidityDex& dex) override;
    void Visit(const VectorReadFixedLenValueDex& dex) override;
    void Visit(const VectorReadVarLenValueDex& dex) override;
    void Visit(const LocalBitMapValidityDex& dex) override;
    void Visit(const TrueDex& dex) override;
    void Visit(const FalseDex& dex) override;
    void Visit(const LiteralDex& dex) override;
    void Visit(const NonNullableFuncDex& dex) override;
    void Visit(const NullableNeverFuncDex& dex) override;
    void Visit(const NullableInternalFuncDex& dex) override;
    void Visit(const IfDex& dex) override;
    void Visit(const BooleanAndDex& dex) override;
    void Visit(const BooleanOrDex& dex) override;
    void Visit(const InExprDexBase<int32_t>& dex) override;
    void Visit(const InExprDexBase<int64_t>& dex) override;
    void Visit(const InExprDexBase<float>& dex) override;
    void Visit(const InExprDexBase<double>& dex) override;
    void Visit(const InExprDexBase<gandiva::DecimalScalar128>& dex) override;
    void Visit(const InExprDexBase<std::string>& dex) override;
    template <typename Type>
    void VisitInExpression(const InExprDexBase<Type>& dex);
    LValuePtr result() { return result_; }
    bool has_arena_allocs() { return has_arena_allocs_; }
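
The overload-per-node-type structure above can be illustrated with a toy version. This is only a sketch: LiteralDex, AddDex and EvalVisitor here are simplified stand-ins I made up (the real Visitor emits LLVM IR and returns an LValuePtr, while this one just computes an integer).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

struct LiteralDex;
struct AddDex;

// One Visit overload per node type, in the style of DexVisitor.
struct DexVisitor {
  virtual ~DexVisitor() = default;
  virtual void Visit(const LiteralDex& dex) = 0;
  virtual void Visit(const AddDex& dex) = 0;
};

struct Dex {
  virtual ~Dex() = default;
  virtual void Accept(DexVisitor& visitor) const = 0;
};

struct LiteralDex : Dex {
  explicit LiteralDex(int64_t v) : value(v) {}
  void Accept(DexVisitor& visitor) const override { visitor.Visit(*this); }
  int64_t value;
};

struct AddDex : Dex {
  AddDex(std::shared_ptr<Dex> l, std::shared_ptr<Dex> r)
      : left(std::move(l)), right(std::move(r)) {}
  void Accept(DexVisitor& visitor) const override { visitor.Visit(*this); }
  std::shared_ptr<Dex> left, right;
};

// Keeps the result of the most recently visited node, like result() above.
class EvalVisitor : public DexVisitor {
 public:
  void Visit(const LiteralDex& dex) override { result_ = dex.value; }
  void Visit(const AddDex& dex) override {
    dex.left->Accept(*this);
    int64_t lhs = result_;
    dex.right->Accept(*this);
    result_ = lhs + result_;
  }
  int64_t result() const { return result_; }

 private:
  int64_t result_ = 0;
};
```

Overload resolution picks the right Visit for each node's static type inside Accept, so adding a node type means adding one overload rather than touching a switch statement.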

There are also functions dedicated to building LLVM functions and function calls:

    std::vector<llvm::Value*> BuildParams(int holder_idx,
                                          const ValueValidityPairVector& args,
                                          bool with_validity, bool with_context);
    // Generate code to invoke a function call.
    LValuePtr BuildFunctionCall(const NativeFunction* func, DataTypePtr arrow_return_type,
                                std::vector<llvm::Value*>* params);
    // Generate code for an if-else condition.
    LValuePtr BuildIfElse(llvm::Value* condition, std::function<LValuePtr()> then_func,
                          std::function<LValuePtr()> else_func,
                          DataTypePtr arrow_return_type);

An interface for emitting calls to pre-compiled LLVM IR functions:

  /// Generate code to make a function call (to a pre-compiled IR function) which takes
  /// 'args' and has a return type 'ret_type'.
  llvm::Value* AddFunctionCall(const std::string& full_name, llvm::Type* ret_type,
                               const std::vector<llvm::Value*>& args);

The implementation details of the cache:

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
LLVMGenerator::GetCache() {
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
      shared_cache = std::make_shared<
          Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>();
  return shared_cache;
}
Status LLVMGenerator::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  return engine_->SetLLVMObjectCache(object_cache);
}
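
GetCache relies on a function-local static, so every caller shares one cache instance. A minimal sketch of that idiom follows; ToyCache and GetToyCache are hypothetical names, and the string-to-string map merely stands in for the real Cache template.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for Cache<ExpressionCacheKey, ...>.
using ToyCache = std::unordered_map<std::string, std::string>;

// A function-local static is initialized once, on the first call
// (thread-safe since C++11); later calls return the same instance.
inline std::shared_ptr<ToyCache> GetToyCache() {
  static std::shared_ptr<ToyCache> shared_cache = std::make_shared<ToyCache>();
  return shared_cache;
}
```

Because the shared_ptr is returned by value, callers can hold on to the cache without worrying about the order in which statics are destroyed.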

Part of the implementation of Build:

Status LLVMGenerator::Build(const ExpressionVector& exprs, SelectionVector::Mode mode) {
  selection_vector_mode_ = mode;
  for (auto& expr : exprs) {
    auto output = annotator_.AddOutputFieldDescriptor(expr->result());
    ARROW_RETURN_NOT_OK(Add(expr, output));
  }
  // Compile and inject into the process' memory the generated function.
  ARROW_RETURN_NOT_OK(engine_->FinalizeModule());
  // setup the jit functions for each expression.
  for (auto& compiled_expr : compiled_exprs_) {
    auto fn_name = compiled_expr->GetFunctionName(mode);
    ARROW_ASSIGN_OR_RAISE(auto fn_ptr, engine_->CompiledFunction(fn_name));
    auto jit_fn = reinterpret_cast<EvalFunc>(fn_ptr);
    compiled_expr->SetJITFunction(selection_vector_mode_, jit_fn);
  }
  return Status::OK();
}

This part is worth reading closely when time permits. As for the tests, the example given is a vector addition that LLVM auto-vectorizes:

TEST_F(TestLLVMGenerator, TestAdd) {
  // Setup LLVM generator to do an arithmetic add of two vectors
  ASSERT_OK_AND_ASSIGN(auto generator,
                       LLVMGenerator::Make(TestConfigWithIrDumping(), false));
  Annotator annotator;
  auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
  auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
  auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);
  auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0);
  auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
  auto field1 = std::make_shared<arrow::Field>("f1", arrow::int32());
  auto desc1 = annotator.CheckAndAddInputFieldDescriptor(field1);
  auto validity_dex1 = std::make_shared<VectorReadValidityDex>(desc1);
  auto value_dex1 = std::make_shared<VectorReadFixedLenValueDex>(desc1);
  auto pair1 = std::make_shared<ValueValidityPair>(validity_dex1, value_dex1);
  DataTypeVector params{arrow::int32(), arrow::int32()};
  auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
  FunctionSignature signature(func_desc->name(), func_desc->params(),
                              func_desc->return_type());
  const NativeFunction* native_func =
      generator->function_registry_->LookupSignature(signature);
  std::vector<ValueValidityPairPtr> pairs{pair0, pair1};
  auto func_dex = std::make_shared<NonNullableFuncDex>(
      func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
  auto field_sum = std::make_shared<arrow::Field>("out", arrow::int32());
  auto desc_sum = annotator.CheckAndAddInputFieldDescriptor(field_sum);
  // LLVM 10 doesn't like the expr function name to be the same as the module name when
  // LLJIT is used
  std::string fn_name = "llvm_gen_test_add_expr";
  ASSERT_OK(generator->engine_->LoadFunctionIRs());
  ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                        SelectionVector::MODE_NONE));
  ASSERT_OK(generator->engine_->FinalizeModule());
  auto const& ir = generator->engine_->ir();
  EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
  ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
  ASSERT_TRUE(fn_ptr);
  auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
  constexpr size_t kNumRecords = 4;
  std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};
  std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};
  uint64_t in_bitmap = 0xffffffffffffffffull;
  std::array<uint32_t, kNumRecords> out{0, 0, 0, 0};
  uint64_t out_bitmap = 0;
  std::array<uint8_t*, 6> addrs{
      reinterpret_cast<uint8_t*>(a0.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(a1.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap),
  };
  std::array<int64_t, 6> addr_offsets{0, 0, 0, 0, 0, 0};
  eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
            0 /* dummy context ptr */, kNumRecords);
  EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));
  EXPECT_EQ(out_bitmap, 0ULL);
}

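GPT's walkthrough of this test is worth a look:

The TestAdd test case exercises the complete flow by which the LLVM code generator produces a vector addition. In detail:

Test goal

Verify that Gandiva can correctly generate and execute LLVM code that adds two int32 vectors.

Test flow

1. Initialize the LLVM generator
ASSERT_OK_AND_ASSIGN(auto generator, LLVMGenerator::Make(TestConfigWithIrDumping(), false));
Creates an LLVM code generator whose configuration supports dumping the IR.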

2. Set up the input vector descriptors
// First input vector f0 (int32)
auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);  // validity bitmap
auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0); // data values
auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
// Second input vector f1 (int32): set up the same way

3. Define the add function
DataTypeVector params{arrow::int32(), arrow::int32()};
auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
Defines an add function that takes two int32 parameters and returns int32.

4. Create the function expression
auto func_dex = std::make_shared<NonNullableFuncDex>(
    func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
Binds the two input value/validity pairs to the add function.

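5. Generate the LLVM code
ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                      SelectionVector::MODE_NONE));
Generates the LLVM IR for the expression.

GPT reads the second argument (4) as "process 4 records", but as far as I can tell from llvm_generator.cc it is a buffer count; the number of records is passed later, when the compiled function is called.

6. Compile and obtain the function pointer
ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
Compiles the LLVM IR to machine code.

Obtains an executable function pointer.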

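7. Prepare the test data
std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};  // first vector
std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};  // second vector
uint64_t in_bitmap = 0xffffffffffffffffull;        // all rows valid
std::array<uint32_t, kNumRecords> out{0, 0, 0, 0}; // output vector

8. Execute the generated function
eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
          0 /* dummy context ptr */, kNumRecords);
Calls the compiled function to perform the vector addition.

9. Verify the results
EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));  // 1+5, 2+6, 3+7, 4+8
EXPECT_EQ(out_bitmap, 0ULL);  // the output bitmap is still 0
GPT glosses the last check as "all valid", but in Arrow a set bit marks a valid row, so a bitmap of 0 cannot mean that. More likely the value function called here simply never writes the output validity bitmap.
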
Key technical points

1. Checking the vectorization
EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
Checks that the generated LLVM IR contains a vectorized loop body.

This confirms that LLVM actually vectorized the loop.

2. Memory layout
std::array<uint8_t*, 6> addrs{
    reinterpret_cast<uint8_t*>(a0.data()),  // first vector: data
    reinterpret_cast<uint8_t*>(&in_bitmap), // first vector: validity bitmap
    reinterpret_cast<uint8_t*>(a1.data()),  // second vector: data
    reinterpret_cast<uint8_t*>(&in_bitmap), // second vector: validity bitmap
    reinterpret_cast<uint8_t*>(out.data()), // output vector: data
    reinterpret_cast<uint8_t*>(&out_bitmap) // output vector: validity bitmap
};

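What the test shows

This test verifies Gandiva's core functionality:

Expression handling: turning a high-level expression into the internal representation

Code generation: producing efficient LLVM IR

Compilation and execution: compiling the IR to machine code and running it

Vectorization: making sure the generated code takes advantage of SIMD instructions

Correctness: checking that the computed results are right

It is an end-to-end integration test that makes sure the whole code generation and execution pipeline works.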
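
To see what the compiled function has to do with that buffer array, here is a hand-written scalar equivalent. It is only a sketch under the layout used by the test (slot 0 and 2 hold the input data, slot 4 the output data); EvalAddInt32 and IsValid are hypothetical names, and the real JIT-compiled function takes more arguments.

```cpp
#include <cassert>
#include <cstdint>

// Arrow validity bitmaps hold one bit per row, least-significant bit first;
// a set bit means the row is valid (non-null).
inline bool IsValid(const uint8_t* bitmap, int64_t row) {
  return (bitmap[row / 8] >> (row % 8)) & 1;
}

// Scalar stand-in for the generated add. Only the value buffer is written;
// the output validity bitmap (slot 5) is left untouched.
inline void EvalAddInt32(uint8_t** addrs, int64_t num_records) {
  const int32_t* a0 = reinterpret_cast<const int32_t*>(addrs[0]);
  const int32_t* a1 = reinterpret_cast<const int32_t*>(addrs[2]);
  int32_t* out = reinterpret_cast<int32_t*>(addrs[4]);
  for (int64_t i = 0; i < num_records; ++i) {
    out[i] = a0[i] + a1[i];
  }
}
```

A loop of this shape, with no branches and contiguous loads and stores, is the kind LLVM's loop vectorizer handles well, which is presumably why the test can expect a vector.body block in the IR.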

The testing::HasSubstr used here is a GMock matcher.

The next test shows that a C function can be registered directly:

TEST_F(TestLLVMGenerator, VerifyExtendedCFunctions) {
  VerifyFunctionMapping("multiply_by_three_int32", [](auto registry) {
    return TestConfigWithCFunction(std::move(registry));
  });
}

// test_util.cc
std::shared_ptr<Configuration> TestConfigWithCFunction(
    std::shared_ptr<FunctionRegistry> registry) {
  return BuildConfigurationWithRegistry(std::move(registry), [](auto reg) {
    return reg->Register(GetTestExternalCFunction(),
                         reinterpret_cast<void*>(multiply_by_three));
  });
}
static int64_t multiply_by_three(int32_t value) { return value * 3; }
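
The registration above hands the function over as a void*. The sketch below shows that type-erasure round trip with a toy registry; ToyRegistry is hypothetical and far simpler than FunctionRegistry, and the caller is assumed to know the real signature when casting back.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Same shape as the test helper: int32 in, int64 out.
static int64_t multiply_by_three(int32_t value) {
  return static_cast<int64_t>(value) * 3;
}

// Hypothetical registry: maps a function name to a type-erased pointer.
class ToyRegistry {
 public:
  void Register(const std::string& name, void* fn) { fns_[name] = fn; }

  // Returns nullptr when nothing is registered under `name`.
  void* Lookup(const std::string& name) const {
    auto it = fns_.find(name);
    if (it == fns_.end()) return nullptr;
    return it->second;
  }

 private:
  std::unordered_map<std::string, void*> fns_;
};
```

A caller would register with reinterpret_cast<void*>(multiply_by_three) and later cast the looked-up pointer back to int64_t (*)(int32_t) before calling it. Nothing checks that cast, which is why the real registry stores the function's metadata next to the pointer.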

literal_holder

LiteralHolder gives Gandiva a uniform way to represent and handle literal values of every supported type:

namespace gandiva {
using LiteralHolder =
    std::variant<bool, float, double, int8_t, int16_t, int32_t, int64_t, uint8_t,
                 uint16_t, uint32_t, uint64_t, std::string, DecimalScalar128>;
GANDIVA_EXPORT std::string ToString(const LiteralHolder& holder);
}  // namespace gandiva

std::variant, introduced in C++17, is a type-safe union: at run time it holds a value of exactly one of a fixed set of types, without the hazards of a traditional union.

Rust's enum is a more powerful counterpart to std::variant.
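
Here is a minimal sketch of how a ToString over such a variant might look, using std::visit. ToyLiteral is a hypothetical, cut-down version of LiteralHolder with only three alternatives, and ToyToString is not Gandiva's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical, reduced version of LiteralHolder.
using ToyLiteral = std::variant<bool, int32_t, std::string>;

// std::visit calls the lambda with whichever alternative is currently held;
// `if constexpr` selects the formatting for that type at compile time.
inline std::string ToyToString(const ToyLiteral& holder) {
  return std::visit(
      [](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, bool>) {
          return value ? "true" : "false";
        } else if constexpr (std::is_same_v<T, std::string>) {
          return value;
        } else {
          return std::to_string(value);
        }
      },
      holder);
}
```

If an alternative is added to the variant and no branch can format it, the std::to_string call fails to compile, so a forgotten case is caught at build time.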

interval_holder

Handles the various kinds of time intervals:

  // Pass only years and days to cast
  data = "P12Y15D";
  response = cast_interval_day(&execution_context_, data.data(), 7, true, &out_valid);
  qty_days_in_response = 15;
  qty_millis_in_response = 0;
  EXPECT_TRUE(out_valid);
  EXPECT_FALSE(execution_context_.has_error());
  EXPECT_EQ(response, (qty_millis_in_response << 32) | qty_days_in_response);
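
The last expectation shows how the two parts share one 64-bit value: milliseconds in the upper 32 bits, days in the lower 32. A small sketch of that packing follows; PackDayTime, UnpackDays and UnpackMillis are hypothetical helpers of mine, written with unsigned shifts to avoid shifting negative values.

```cpp
#include <cassert>
#include <cstdint>

// Pack as in the test's expectation: (millis << 32) | days.
inline int64_t PackDayTime(int32_t days, int32_t millis) {
  uint64_t high = static_cast<uint64_t>(static_cast<uint32_t>(millis)) << 32;
  uint64_t low = static_cast<uint32_t>(days);
  return static_cast<int64_t>(high | low);
}

// Lower 32 bits hold the day count.
inline int32_t UnpackDays(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint32_t>(packed & 0xffffffff));
}

// Upper 32 bits hold the millisecond count.
inline int32_t UnpackMillis(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint64_t>(packed) >> 32);
}
```

With 15 days and 0 milliseconds the packed value is simply 15, which matches what the test expects for "P12Y15D".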

hash_utils

The hash component is built on OpenSSL and mainly provides the SHA-family and MD5 functions:

GANDIVA_EXPORT
const char* gdv_sha512_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha512_result_length = 128;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha512(),
                                sha512_result_length, out_length);
}
/// Hashes a generic message using the SHA256 algorithm
GANDIVA_EXPORT
const char* gdv_sha256_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha256_result_length = 64;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha256(),
                                sha256_result_length, out_length);
}
/// Hashes a generic message using the SHA1 algorithm
GANDIVA_EXPORT
const char* gdv_sha1_hash(int64_t context, const void* message, size_t message_length,
                          int32_t* out_length) {
  constexpr int sha1_result_length = 40;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha1(),
                                sha1_result_length, out_length);
}
GANDIVA_EXPORT
const char* gdv_md5_hash(int64_t context, const void* message, size_t message_length,
                         int32_t* out_length) {
  constexpr int md5_result_length = 32;
  return gdv_hash_using_openssl(context, message, message_length, EVP_md5(),
                                md5_result_length, out_length);
}
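
The result lengths 128, 64, 40 and 32 are exactly twice the digest sizes in bytes, which suggests the digests are returned as hexadecimal text. Assuming that is what gdv_hash_using_openssl does, the relationship can be sketched without OpenSSL; ToHex and HexLength are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Encode raw digest bytes as lowercase hex: two characters per byte.
inline std::string ToHex(const std::vector<uint8_t>& digest) {
  static const char kDigits[] = "0123456789abcdef";
  std::string hex;
  hex.reserve(digest.size() * 2);
  for (uint8_t byte : digest) {
    hex.push_back(kDigits[byte >> 4]);
    hex.push_back(kDigits[byte & 0x0f]);
  }
  return hex;
}

// Length of the hex text for a digest of `digest_bits` bits.
constexpr int HexLength(int digest_bits) { return digest_bits / 8 * 2; }
```

SHA-512 produces 512 bits, so HexLength(512) is 128, and likewise 64 for SHA-256, 40 for SHA-1 and 32 for MD5.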

gandiva_object_cache

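This caches what gets compiled for operations such as result1 = evaluate("column1 + column2 * 3"); as far as I can tell, it is the object code produced for the expression that is cached, not the computed values. The class derives from llvm::ObjectCache and keeps the code in an llvm::MemoryBuffer: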

class GandivaObjectCache : public llvm::ObjectCache {
 public:
  explicit GandivaObjectCache(
      std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>&
          cache,
      ExpressionCacheKey key);
  ~GandivaObjectCache() {}
  void notifyObjectCompiled(const llvm::Module* M, llvm::MemoryBufferRef Obj);
  std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module* M);
 private:
  ExpressionCacheKey cache_key_;
  std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> cache_;
};
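
The notifyObjectCompiled and getObject pair amounts to "store after compiling, look up before compiling". A toy version of that flow is sketched below; ToyObjectCache and GetOrCompile are hypothetical, the key is a plain string rather than an ExpressionCacheKey, and the "object code" is just the expression's bytes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using ObjectBuffer = std::shared_ptr<std::vector<char>>;

class ToyObjectCache {
 public:
  // Called after a compile: remember the object code under this key.
  void NotifyObjectCompiled(const std::string& key, std::vector<char> object) {
    cache_[key] = std::make_shared<std::vector<char>>(std::move(object));
  }

  // Called before a compile: returns nullptr on a cache miss.
  ObjectBuffer GetObject(const std::string& key) const {
    auto it = cache_.find(key);
    if (it == cache_.end()) return nullptr;
    return it->second;
  }

 private:
  std::unordered_map<std::string, ObjectBuffer> cache_;
};

// Compile only on a miss; `compile_count` records how often that happens.
inline ObjectBuffer GetOrCompile(ToyObjectCache& cache, const std::string& expr,
                                 int* compile_count) {
  if (ObjectBuffer hit = cache.GetObject(expr)) return hit;
  ++*compile_count;
  std::vector<char> object(expr.begin(), expr.end());  // pretend object code
  cache.NotifyObjectCompiled(expr, std::move(object));
  return cache.GetObject(expr);
}
```

Asking for the same expression twice compiles once; the second request is served from the map.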

function_signature

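Function signatures can be hashed. My guess is that this supports hashed lookups, for example in the SignatureMap kept by the function registry. Note that the hash ignores the case of the function name: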

  EXPECT_EQ(FunctionSignature("extract_month", {arrow::date32()}, arrow::int64()),
            FunctionSignature("extract_month", {local_date32_type_}, local_i64_type_));
TEST_F(TestFunctionSignature, TestHash) {
  FunctionSignature f1("add", {arrow::int32(), arrow::int32()}, arrow::int64());
  FunctionSignature f2("add", {local_i32_type_, local_i32_type_}, local_i64_type_);
  EXPECT_EQ(f1.Hash(), f2.Hash());
  FunctionSignature f3("extractDay", {arrow::int64()}, arrow::int64());
  FunctionSignature f4("extractday", {arrow::int64()}, arrow::int64());
  EXPECT_EQ(f3.Hash(), f4.Hash());
}
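
The extractDay/extractday expectation implies that names are compared without regard to case. One way to get that behavior is to lower-case the name before hashing and comparing, as in this sketch. ToySignature and its helpers are hypothetical, types are plain strings here, and the hash-combining formula is a common idiom rather than Gandiva's.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct ToySignature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

inline std::string ToLower(std::string s) {
  for (char& c : s) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  return s;
}

// Names compare case-insensitively; types must match exactly.
inline bool operator==(const ToySignature& a, const ToySignature& b) {
  return ToLower(a.name) == ToLower(b.name) &&
         a.param_types == b.param_types && a.return_type == b.return_type;
}

inline std::size_t HashCombine(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hash the lower-cased name so equal signatures always hash equally.
inline std::size_t Hash(const ToySignature& sig) {
  std::hash<std::string> hasher;
  std::size_t seed = hasher(ToLower(sig.name));
  for (const auto& type : sig.param_types) {
    seed = HashCombine(seed, hasher(type));
  }
  return HashCombine(seed, hasher(sig.return_type));
}
```

Equality and hashing have to agree on the case rule, otherwise two signatures that compare equal could land in different buckets of a hash map.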

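function_registry

The registry holds every function Gandiva can call, both pre-compiled bitcode functions and registered C functions, and looks them up by signature: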

class GANDIVA_EXPORT FunctionRegistry {
 public:
  using iterator = const NativeFunction*;
  using FunctionHolderMaker =
      std::function<arrow::Result<std::shared_ptr<FunctionHolder>>(
          const FunctionNode& function_node)>;
  FunctionRegistry();
  FunctionRegistry(const FunctionRegistry&) = delete;
  FunctionRegistry& operator=(const FunctionRegistry&) = delete;
  /// Lookup a pre-compiled function by its signature.
  const NativeFunction* LookupSignature(const FunctionSignature& signature) const;
  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);
  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);
  /// \brief register a C function into the function registry
  /// @param func the registered function's metadata
  /// @param c_function_ptr the function pointer to the
  /// registered function's implementation
  /// @param function_holder_maker this will be used as the function holder if the
  /// function requires a function holder
  arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);
  /// \brief get a list of bitcode memory buffers saved in the registry
  const std::vector<std::shared_ptr<arrow::Buffer>>& GetBitcodeBuffers() const;
  /// \brief get a list of C functions saved in the registry
  const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;
  const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;
  iterator begin() const;
  iterator end() const;
  iterator back() const;
  friend arrow::Result<std::shared_ptr<FunctionRegistry>> MakeDefaultFunctionRegistry();
 private:
  std::vector<NativeFunction> pc_registry_;
  SignatureMap pc_registry_map_;
  std::vector<std::shared_ptr<arrow::Buffer>> bitcode_memory_buffers_;
  std::vector<std::pair<NativeFunction, void*>> c_functions_;
  FunctionHolderMakerRegistry holder_maker_registry_;
  Status Add(NativeFunction func);
};
/// \brief get the default function registry
GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();
}  // namespace gandiva

function_ir_builder

A very general-purpose IR builder (why did I never think of this before?) that can even build if-else branching between basic blocks:

class FunctionIRBuilder {
 public:
  explicit FunctionIRBuilder(Engine* engine) : engine_(engine) {}
  virtual ~FunctionIRBuilder() = default;
 protected:
  LLVMTypes* types() { return engine_->types(); }
  llvm::Module* module() { return engine_->module(); }
  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }
  /// Build an if-else block.
  llvm::Value* BuildIfElse(llvm::Value* condition, llvm::Type* return_type,
                           std::function<llvm::Value*()> then_func,
                           std::function<llvm::Value*()> else_func);
  struct NamedArg {
    std::string name;
    llvm::Type* type;
  };
  /// Build llvm fn.
  llvm::Function* BuildFunction(const std::string& function_name, llvm::Type* return_type,
                                std::vector<NamedArg> in_args);
 private:
  Engine* engine_;
};

filter

This part is also implemented on top of LLVM and looks much like the Projector:

 private:
  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;

To plug in a cache, simply call SetLLVMObjectCache:

Status Engine::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  auto cached_buffer = object_cache.getObject(nullptr);
  if (cached_buffer) {
    auto error = lljit_->addObjectFile(std::move(cached_buffer));
    if (error) {
      return Status::CodeGenError("Failed to add cached object file to LLJIT: ",
                                  llvm::toString(std::move(error)));
    }
  }
  return Status::OK();
}

Optimization passes can be attached through the pass manager:

static void OptimizeModuleWithNewPassManager(llvm::Module& module,
                                             llvm::TargetIRAnalysis target_analysis) {
  // Setup an optimiser pipeline
  llvm::PassBuilder pass_builder;
  llvm::LoopAnalysisManager loop_am;
  llvm::FunctionAnalysisManager function_am;
  llvm::CGSCCAnalysisManager cgscc_am;
  llvm::ModuleAnalysisManager module_am;
  function_am.registerPass([&] { return target_analysis; });
  // Register required analysis managers
  pass_builder.registerModuleAnalyses(module_am);
  pass_builder.registerCGSCCAnalyses(cgscc_am);
  pass_builder.registerFunctionAnalyses(function_am);
  pass_builder.registerLoopAnalyses(loop_am);
  pass_builder.crossRegisterProxies(loop_am, function_am, cgscc_am, module_am);
  pass_builder.registerPipelineStartEPCallback([&](llvm::ModulePassManager& module_pm,
                                                   llvm::OptimizationLevel Level) {
    module_pm.addPass(llvm::ModuleInlinerPass());
    llvm::FunctionPassManager function_pm;
    function_pm.addPass(llvm::InstCombinePass());
    function_pm.addPass(llvm::PromotePass());
    function_pm.addPass(llvm::GVNPass());
    function_pm.addPass(llvm::NewGVNPass());
    function_pm.addPass(llvm::SimplifyCFGPass());
    function_pm.addPass(llvm::LoopVectorizePass());
    function_pm.addPass(llvm::SLPVectorizerPass());
    module_pm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(function_pm)));
    module_pm.addPass(llvm::GlobalOptPass());
  });

engine

Almost all of the LLVM engine configuration lives in engine.h, engine.cc and engine_llvm_test.cc. The engine can also load pre-compiled LLVM IR:

  /// load pre-compiled IR modules from precompiled_bitcode.cc and merge them into
  /// the main module.
  Status LoadPreCompiledIR();
  // load external pre-compiled bitcodes into module
  Status LoadExternalPreCompiledIR();
  // Create and add mappings for cpp functions that can be accessed from LLVM.
  arrow::Status AddGlobalMappings();
  // Remove unused functions to reduce compile time.
  Status RemoveUnusedFunctions();
  std::unique_ptr<llvm::LLVMContext> context_;
  std::unique_ptr<llvm::orc::LLJIT> lljit_;
  std::unique_ptr<llvm::IRBuilder<>> ir_builder_;
  std::unique_ptr<llvm::Module> module_;
  LLVMTypes types_;
  std::vector<std::string> functions_to_compile_;
  bool optimize_ = true;
  bool module_finalized_ = false;
  bool cached_;
  bool functions_loaded_ = false;
  std::shared_ptr<FunctionRegistry> function_registry_;
  std::string module_ir_;
  std::unique_ptr<llvm::TargetMachine> target_machine_;
  const std::shared_ptr<Configuration> conf_;
};

encrypt

Gandiva includes some encryption utilities (although I could not find any documentation on how to use them); the AES implementation also comes from OpenSSL:

GANDIVA_EXPORT
int32_t aes_encrypt(const char* plaintext, int32_t plaintext_len, const char* key,
                    int32_t key_len, unsigned char* cipher);
/**
 * Decrypt data using aes algorithm
 **/
GANDIVA_EXPORT
int32_t aes_decrypt(const char* ciphertext, int32_t ciphertext_len, const char* key,
                    int32_t key_len, unsigned char* plaintext);

The corresponding test:

TEST(TestShaEncryptUtils, TestAesEncryptDecrypt) {
  // 16 bytes key
  auto* key = "12345678abcdefgh";
  auto* to_encrypt = "some test string";
  auto key_len = static_cast<int32_t>(strlen(reinterpret_cast<const char*>(key)));
  auto to_encrypt_len =
      static_cast<int32_t>(strlen(reinterpret_cast<const char*>(to_encrypt)));
  unsigned char cipher_1[64];
  int32_t cipher_1_len =
      gandiva::aes_encrypt(to_encrypt, to_encrypt_len, key, key_len, cipher_1);
  unsigned char decrypted_1[64];
  int32_t decrypted_1_len = gandiva::aes_decrypt(reinterpret_cast<const char*>(cipher_1),
                                                 cipher_1_len, key, key_len, decrypted_1);
  EXPECT_EQ(std::string(reinterpret_cast<const char*>(to_encrypt), to_encrypt_len),
            std::string(reinterpret_cast<const char*>(decrypted_1), decrypted_1_len));

decimal_ir

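Code generation for decimals gets special handling. These are 128-bit decimal values that carry a precision and a scale, not floating-point numbers, and there seem to be quite a few pitfalls here 😂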

class DecimalIR : public FunctionIRBuilder {
 public:
  explicit DecimalIR(Engine* engine)
      : FunctionIRBuilder(engine), enable_ir_traces_(false) {}
  /// Build decimal IR functions and add them to the engine.
  static Status AddFunctions(Engine* engine);
  void EnableTraces() { enable_ir_traces_ = true; }
  llvm::Value* CallDecimalFunction(const std::string& function_name,
                                   llvm::Type* return_type,
                                   const std::vector<llvm::Value*>& args);
 private:
  /// The intrinsic fn for divide with small divisors is about 10x slower, so not
  /// using these.
  static const bool kUseOverflowIntrinsics = false;
  // Holder for an i128 value, along with its with scale and precision.
  class ValueFull {
   public:
    ValueFull(llvm::Value* value, llvm::Value* precision, llvm::Value* scale)
        : value_(value), precision_(precision), scale_(scale) {}
    llvm::Value* value() const { return value_; }
    llvm::Value* precision() const { return precision_; }
    llvm::Value* scale() const { return scale_; }
   private:
    llvm::Value* value_;
    llvm::Value* precision_;
    llvm::Value* scale_;
  };
  // Holder for an i128 value, and a boolean indicating overflow.
  class ValueWithOverflow {
   public:
    ValueWithOverflow(llvm::Value* value, llvm::Value* overflow)
        : value_(value), overflow_(overflow) {}
    // Make from IR struct
    static ValueWithOverflow MakeFromStruct(DecimalIR* decimal_ir, llvm::Value* dstruct);
    // Build a corresponding IR struct
    llvm::Value* AsStruct(DecimalIR* decimal_ir) const;
    llvm::Value* value() const { return value_; }
    llvm::Value* overflow() const { return overflow_; }
   private:
    llvm::Value* value_;
    llvm::Value* overflow_;
  };
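
ValueWithOverflow pairs a result with an overflow flag. The same idea is sketched below on int64_t in plain C++, using explicit range checks instead of compiler intrinsics. ValueWithOverflow64 and AddWithOverflow are hypothetical, and the real code works on i128 values in LLVM IR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical 64-bit analogue of ValueWithOverflow.
struct ValueWithOverflow64 {
  int64_t value;
  bool overflow;
};

// Add two values and report whether the true sum left the int64 range.
inline ValueWithOverflow64 AddWithOverflow(int64_t a, int64_t b) {
  const int64_t kMax = std::numeric_limits<int64_t>::max();
  const int64_t kMin = std::numeric_limits<int64_t>::min();
  bool overflow = (b > 0 && a > kMax - b) || (b < 0 && a < kMin - b);
  // Compute the wrapped sum in unsigned arithmetic to avoid signed overflow.
  int64_t value =
      static_cast<int64_t>(static_cast<uint64_t>(a) + static_cast<uint64_t>(b));
  return {value, overflow};
}
```

Returning the flag alongside the value lets the caller decide what an overflow means, for instance falling back to a slower path, instead of failing inside the arithmetic helper.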

Appendix: mapping from Arrow types to C function types

Gandiva type (Arrow data type)	C function type
int8	int8_t
int16	int16_t
int32	int32_t
int64	int64_t
uint8	uint8_t
uint16	uint16_t
uint32	uint32_t
uint64	uint64_t
float32	float
float64	double
boolean	bool
date32	int32_t
date64	int64_t
timestamp	int64_t
time32	int32_t
time64	int64_t
interval_month	int32_t
interval_day_time	int64_t
utf8 (as parameter type)	const char*, uint32_t
utf8 (as return type)	int64_t context, const char*, uint32_t*
binary (as parameter type)	const char*, uint32_t
binary (as return type)	int64_t context, const char*, uint32_t*
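
As an illustration of the utf8 parameter row, a C function that receives a string argument would take a pointer and a length. The function below is a hypothetical example of mine, not one that ships with Gandiva, and it assumes the data is not NUL-terminated, so only the length bounds the loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical external function: utf8 in (const char*, uint32_t), int64 out.
extern "C" int64_t count_spaces_utf8(const char* data, uint32_t length) {
  int64_t count = 0;
  for (uint32_t i = 0; i < length; ++i) {
    if (data[i] == ' ') ++count;
  }
  return count;
}
```

Passing a shorter length than the underlying buffer makes the function see only that prefix.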

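Summary

Setting aside the puzzle of why the source files are not split into subdirectories, the code quality is high, and the comments and test cases at the key points make it possible to understand what Gandiva does. It is a very interesting project.

Although there is a guide to Gandiva external functions, you still have to read the test cases to see how they are actually used.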

项目历史&现状简述

虽然网传这个项目烂尾（根本就没这回事好吧😅），但事实是Gandiva一直都有commit进行维护，今年LLVM20出来以后也很快做了跟进

目前Gandiva有C和C++的相关库，但对于Rust版本的Arrow似乎就不提供相关支持了：Interfaces for gandiva bindings.

源码解析

代码下载于2025.6.24，所有代码均平铺在单层目录上

Gandiva源码的地址：https://github.com/apache/arrow/tree/main/cpp/src/gandiva

|-- CMakeLists.txt
|-- GandivaConfig.cmake.in
|-- annotator.cc
|-- annotator.h
|-- annotator_test.cc
|-- arrow.h
|-- basic_decimal_scalar.h
|-- bitmap_accumulator.cc
|-- bitmap_accumulator.h
|-- bitmap_accumulator_test.cc
|-- cache.cc
|-- cache.h
|-- cache_test.cc
|-- cast_time.cc
|-- compiled_expr.h
|-- condition.h
|-- configuration.cc
|-- configuration.h
|-- context_helper.cc
|-- date_utils.cc
|-- date_utils.h
|-- decimal_ir.cc
|-- decimal_ir.h
|-- decimal_scalar.h
|-- decimal_type_util.cc
|-- decimal_type_util.h
|-- decimal_type_util_test.cc
|-- decimal_xlarge.cc
|-- decimal_xlarge.h
|-- dex.h
|-- dex_visitor.h
|-- encrypt_utils.cc
|-- encrypt_utils.h
|-- encrypt_utils_test.cc
|-- engine.cc
|-- engine.h
|-- engine_llvm_test.cc
|-- eval_batch.h
|-- execution_context.h
|-- exported_funcs.cc
|-- exported_funcs.h
|-- exported_funcs_registry.cc
|-- exported_funcs_registry.h
|-- exported_funcs_registry_test.cc
|-- expr_decomposer.cc
|-- expr_decomposer.h
|-- expr_decomposer_test.cc
|-- expr_validator.cc
|-- expr_validator.h
|-- expression.cc
|-- expression.h
|-- expression_cache_key.h
|-- expression_registry.cc
|-- expression_registry.h
|-- expression_registry_test.cc
|-- external_c_functions.cc
|-- field_descriptor.h
|-- filter.cc
|-- filter.h
|-- formatting_utils.h
|-- func_descriptor.h
|-- function_holder.h
|-- function_holder_maker_registry.cc
|-- function_holder_maker_registry.h
|-- function_ir_builder.cc
|-- function_ir_builder.h
|-- function_registry.cc
|-- function_registry.h
|-- function_registry_arithmetic.cc
|-- function_registry_arithmetic.h
|-- function_registry_common.h
|-- function_registry_datetime.cc
|-- function_registry_datetime.h
|-- function_registry_hash.cc
|-- function_registry_hash.h
|-- function_registry_math_ops.cc
|-- function_registry_math_ops.h
|-- function_registry_string.cc
|-- function_registry_string.h
|-- function_registry_test.cc
|-- function_registry_timestamp_arithmetic.cc
|-- function_registry_timestamp_arithmetic.h
|-- function_signature.cc
|-- function_signature.h
|-- function_signature_test.cc
|-- gandiva.pc.in
|-- gandiva_aliases.h
|-- gandiva_object_cache.cc
|-- gandiva_object_cache.h
|-- gdv_function_stubs.cc
|-- gdv_function_stubs.h
|-- gdv_function_stubs_test.cc
|-- gdv_hash_function_stubs.cc
|-- gdv_string_function_stubs.cc
|-- hash_utils.cc
|-- hash_utils.h
|-- hash_utils_test.cc
|-- in_holder.h
|-- interval_holder.cc
|-- interval_holder.h
|-- interval_holder_test.cc
|-- literal_holder.cc
|-- literal_holder.h
|-- llvm_generator.cc
|-- llvm_generator.h
|-- llvm_generator_test.cc
|-- llvm_includes.h
|-- llvm_types.cc
|-- llvm_types.h
|-- llvm_types_test.cc
|-- local_bitmaps_holder.h
|-- lru_cache.h
|-- lru_cache_test.cc
|-- lvalue.h
|-- make_precompiled_bitcode.py
|-- native_function.h
|-- node.h
|-- node_visitor.h
|-- precompiled
|   |-- CMakeLists.txt
|   |-- arithmetic_ops.cc
|   |-- arithmetic_ops_test.cc
|   |-- bitmap.cc
|   |-- bitmap_test.cc
|   |-- decimal_ops.cc
|   |-- decimal_ops.h
|   |-- decimal_ops_test.cc
|   |-- decimal_wrapper.cc
|   |-- epoch_time_point.h
|   |-- epoch_time_point_test.cc
|   |-- extended_math_ops.cc
|   |-- extended_math_ops_test.cc
|   |-- hash.cc
|   |-- hash_test.cc
|   |-- print.cc
|   |-- string_ops.cc
|   |-- string_ops_test.cc
|   |-- testing.h
|   |-- time.cc
|   |-- time_constants.h
|   |-- time_fields.h
|   |-- time_test.cc
|   |-- timestamp_arithmetic.cc
|   `-- types.h
|-- precompiled_bitcode.cc.in
|-- projector.cc
|-- projector.h
|-- random_generator_holder.cc
|-- random_generator_holder.h
|-- random_generator_holder_test.cc
|-- regex_functions_holder.cc
|-- regex_functions_holder.h
|-- regex_functions_holder_test.cc
|-- regex_util.cc
|-- regex_util.h
|-- selection_vector.cc
|-- selection_vector.h
|-- selection_vector_impl.h
|-- selection_vector_test.cc
|-- simple_arena.h
|-- simple_arena_test.cc
|-- symbols.map
|-- tests
|   |-- CMakeLists.txt
|   |-- binary_test.cc
|   |-- boolean_expr_test.cc
|   |-- date_time_test.cc
|   |-- decimal_single_test.cc
|   |-- decimal_test.cc
|   |-- external_functions
|   |   |-- CMakeLists.txt
|   |   |-- multiply_by_two.cc
|   |   `-- multiply_by_two.h
|   |-- filter_project_test.cc
|   |-- filter_test.cc
|   |-- generate_data.h
|   |-- hash_test.cc
|   |-- huge_table_test.cc
|   |-- if_expr_test.cc
|   |-- in_expr_test.cc
|   |-- literal_test.cc
|   |-- micro_benchmarks.cc
|   |-- null_validity_test.cc
|   |-- projector_build_validation_test.cc
|   |-- projector_test.cc
|   |-- test_util.cc
|   |-- test_util.h
|   |-- timed_evaluate.h
|   |-- to_string_test.cc
|   `-- utf8_test.cc
|-- to_date_holder.cc
|-- to_date_holder.h
|-- to_date_holder_test.cc
|-- tree_expr_builder.cc
|-- tree_expr_builder.h
|-- tree_expr_test.cc
|-- value_validity_pair.h
`-- visibility.h

由于代码量极大，只选取部分进行分析

node

关于Tree的Node的定义

namespace gandiva {
class FieldNode;
class FunctionNode;
class IfNode;
class LiteralNode;
class BooleanNode;
template <typename Type>
class InExpressionNode;
/// \brief Visitor for nodes in the expression tree.
class GANDIVA_EXPORT NodeVisitor {
 public:
  virtual ~NodeVisitor() = default;
  virtual Status Visit(const FieldNode& node) = 0;
  virtual Status Visit(const FunctionNode& node) = 0;
  virtual Status Visit(const IfNode& node) = 0;
  virtual Status Visit(const LiteralNode& node) = 0;
  virtual Status Visit(const BooleanNode& node) = 0;
  virtual Status Visit(const InExpressionNode<int32_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<int64_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<float>& node) = 0;
  virtual Status Visit(const InExpressionNode<double>& node) = 0;
  virtual Status Visit(const InExpressionNode<gandiva::DecimalScalar128>& node) = 0;
  virtual Status Visit(const InExpressionNode<std::string>& node) = 0;
};
}  // namespace gandiva
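To make the double-dispatch idea concrete, here is a minimal, self-contained sketch of a visitor over a tiny expression tree. The names Lit, BinOp, ExprVisitor, EvalVisitor and EvalExample are hypothetical and much simpler than Gandiva's Node and NodeVisitor; the sketch only illustrates how one Visit overload per node type lets a new traversal be added without touching the node classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Lit;
struct BinOp;

// One Visit overload per concrete node type, mirroring the shape of NodeVisitor.
struct ExprVisitor {
  virtual ~ExprVisitor() = default;
  virtual void Visit(const Lit& node) = 0;
  virtual void Visit(const BinOp& node) = 0;
};

struct Expr {
  virtual ~Expr() = default;
  virtual void Accept(ExprVisitor& visitor) const = 0;
};

struct Lit : Expr {
  explicit Lit(int v) : value(v) {}
  void Accept(ExprVisitor& visitor) const override { visitor.Visit(*this); }
  int value;
};

struct BinOp : Expr {
  BinOp(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
  void Accept(ExprVisitor& visitor) const override { visitor.Visit(*this); }
  char op;
  std::unique_ptr<Expr> lhs, rhs;
};

// A concrete traversal: evaluate the tree bottom-up.
struct EvalVisitor : ExprVisitor {
  int result = 0;
  void Visit(const Lit& node) override { result = node.value; }
  void Visit(const BinOp& node) override {
    node.lhs->Accept(*this);
    const int l = result;
    node.rhs->Accept(*this);
    const int r = result;
    assert(node.op == '+' || node.op == '*');  // only two operators in this sketch
    result = (node.op == '+') ? l + r : l * r;
  }
};

// Builds and evaluates the tree for 4 * 5 + 3.
inline int EvalExample() {
  auto mul = std::make_unique<BinOp>('*', std::make_unique<Lit>(4),
                                     std::make_unique<Lit>(5));
  BinOp add('+', std::move(mul), std::make_unique<Lit>(3));
  EvalVisitor visitor;
  add.Accept(visitor);
  return visitor.result;
}
```

Adding another pass (printing, type checking, code generation) would mean writing another ExprVisitor subclass; the node classes stay unchanged.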

tree_expr

tree_expr_test.cc
tree_expr_builder.cc
tree_expr_builder.h

Represents expression trees such as 4*5+3; the tree itself is built through TreeExprBuilder.

TEST_F(TestExprTree, TestField) {
  Annotator annotator;
  auto n0 = TreeExprBuilder::MakeField(i0_);
  EXPECT_EQ(n0->return_type(), int32());
  auto n1 = TreeExprBuilder::MakeField(b0_);
  EXPECT_EQ(n1->return_type(), boolean());
  ExprDecomposer decomposer(*registry_, annotator);
  ValueValidityPairPtr pair;
  auto status = decomposer.Decompose(*n1, &pair);
  DCHECK_EQ(status.ok(), true) << status.message();
  auto value = pair->value_expr();
  auto value_dex = std::dynamic_pointer_cast<VectorReadFixedLenValueDex>(value);
  EXPECT_EQ(value_dex->FieldType(), boolean());
  EXPECT_EQ(pair->validity_exprs().size(), 1);
  auto validity = pair->validity_exprs().at(0);
  auto validity_dex = std::dynamic_pointer_cast<VectorReadValidityDex>(validity);
  EXPECT_NE(validity_dex->ValidityIdx(), value_dex->DataIdx());
}

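TreeExprBuilder relies on function overloading (one MakeLiteral per literal type); traversal and transformation of the resulting tree are done with the visitor pattern.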

class GANDIVA_EXPORT TreeExprBuilder {
 public:
  /// \brief create a node on a literal.
  static NodePtr MakeLiteral(bool value);
  static NodePtr MakeLiteral(uint8_t value);
  static NodePtr MakeLiteral(uint16_t value);
  static NodePtr MakeLiteral(uint32_t value);
  static NodePtr MakeLiteral(uint64_t value);
  static NodePtr MakeLiteral(int8_t value);
  static NodePtr MakeLiteral(int16_t value);
  static NodePtr MakeLiteral(int32_t value);
  static NodePtr MakeLiteral(int64_t value);
  static NodePtr MakeLiteral(float value);
  static NodePtr MakeLiteral(double value);
  static NodePtr MakeStringLiteral(const std::string& value);
  static NodePtr MakeBinaryLiteral(const std::string& value);
  static NodePtr MakeDecimalLiteral(const DecimalScalar128& value);

to_date_holder

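Converts strings to dates, returned as milliseconds since the Unix epoch. Judging from the expected values in the test, the time-of-day part of the input does not affect the result: 533779200000 ms is exactly 1986-12-01 00:00:00 UTC.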

TEST_F(TestToDateHolder, TestSimpleDateTime) {
  EXPECT_OK_AND_ASSIGN(auto to_date_holder, ToDateHolder::Make("YYYY-MM-DD HH:MI:SS", 1));
  auto& to_date = *to_date_holder;
  bool out_valid;
  std::string s("1986-12-01 01:01:01");
  int64_t millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01.11");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
  s = std::string("1986-12-01 01:01:01 +0800");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 533779200000);
#if 0
  // TODO : this fails parsing with date::parse and strptime on linux
  s = std::string("1886-12-01 00:00:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int) s.length(), true, &out_valid);
  EXPECT_EQ(out_valid, true);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
#endif
  s = std::string("1886-12-01 01:01:01");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
  s = std::string("1986-12-11 01:30:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid);
  EXPECT_EQ(millis_since_epoch, 534643200000);
}
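The expected constants in this test can be checked by hand. Below is a small sketch that converts a calendar date to milliseconds since the Unix epoch at midnight UTC, using the well-known "days from civil" arithmetic. DaysFromCivil and MidnightMillis are hypothetical helper names for illustration; this is not how ToDateHolder is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Days between 1970-01-01 and the given proleptic Gregorian date.
inline int64_t DaysFromCivil(int64_t y, unsigned m, unsigned d) {
  assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
  y -= (m <= 2);  // treat January and February as months of the previous year
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const unsigned yoe = static_cast<unsigned>(y - era * 400);             // [0, 399]
  const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
  const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
  return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

// Milliseconds since the epoch at midnight UTC of the given date.
inline int64_t MidnightMillis(int64_t y, unsigned m, unsigned d) {
  return DaysFromCivil(y, m, d) * 86400LL * 1000LL;
}
```

For example, MidnightMillis(1986, 12, 1) gives 533779200000 and MidnightMillis(1886, 12, 1) gives -2621894400000, the same values the test expects even when the input string carries a time of day.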

simple_arena

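I did not fully understand this part. It appears to handle memory allocation, handing out memory from chunks (the test below uses a 4096-byte chunk size).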

TEST_F(TestSimpleArena, TestAlloc) {
  int64_t chunk_size = 4096;
  SimpleArena arena(arrow::default_memory_pool(), chunk_size);
  // Small allocations should come from the same chunk.
  int64_t small_size = 100;
  for (int64_t i = 0; i < 20; ++i) {
    auto p = arena.Allocate(small_size);
    EXPECT_NE(p, nullptr);
    EXPECT_EQ(arena.total_bytes(), chunk_size);
    EXPECT_EQ(arena.avail_bytes(), chunk_size - (i + 1) * small_size);
  }
  // large allocations require separate chunks
  int64_t large_size = 100 * chunk_size;
  auto p = arena.Allocate(large_size);
  EXPECT_NE(p, nullptr);
  EXPECT_EQ(arena.total_bytes(), chunk_size + large_size);
  EXPECT_EQ(arena.avail_bytes(), 0);
}
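The behaviour this test checks can be reproduced with a very small bump allocator. The sketch below is my own reading of the test, not the SimpleArena source: TinyArena is a hypothetical class, it ignores alignment and reset, and it uses plain heap arrays instead of an arrow::MemoryPool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

class TinyArena {
 public:
  explicit TinyArena(int64_t min_chunk_size) : min_chunk_size_(min_chunk_size) {
    assert(min_chunk_size > 0);
  }

  // Hands out `size` bytes from the current chunk, opening a new chunk if needed.
  uint8_t* Allocate(int64_t size) {
    assert(size > 0);
    if (avail_bytes_ < size) {
      // Not enough room left: open a fresh chunk large enough for this request.
      // Whatever was left in the previous chunk is simply abandoned.
      const int64_t chunk_size = std::max(size, min_chunk_size_);
      chunks_.push_back(std::make_unique<uint8_t[]>(static_cast<size_t>(chunk_size)));
      total_bytes_ += chunk_size;
      avail_bytes_ = chunk_size;
      avail_ptr_ = chunks_.back().get();
    }
    uint8_t* out = avail_ptr_;
    avail_ptr_ += size;  // bump the pointer; individual allocations are never freed
    avail_bytes_ -= size;
    return out;
  }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t avail_bytes() const { return avail_bytes_; }

 private:
  int64_t min_chunk_size_;
  int64_t total_bytes_ = 0;
  int64_t avail_bytes_ = 0;
  uint8_t* avail_ptr_ = nullptr;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
};
```

With a 4096-byte chunk size, twenty 100-byte allocations all come out of the first chunk, while a 409600-byte request gets a chunk of its own and leaves avail_bytes() at 0, which matches the assertions in TestAlloc.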

selection_vector

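Implements selection vectors for data stored in the Arrow format.

Some background on selection vectors is needed here.

A selection vector is a technique used in data processing systems to record which rows of a batch are selected (valid), so that irrelevant rows are not processed at all. It is common in columnar databases and vectorized execution engines (for example Apache Arrow, Dremio and Gandiva) as a performance optimization.

A selection vector is essentially an index array that stores the positions of the selected rows within the original batch.

No data copying: only the vector is manipulated, and the original data stays where it is.

Efficient filtering: rows that do not satisfy the condition can be skipped quickly.

Support for vectorized execution: combined with batch processing, it helps SIMD performance.

In practice the selection is stored either as a bitmap or as an explicit set of indices.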

TEST_F(TestSelectionVector, TestInt16Set) {
  int max_slots = 10;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt16(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  selection->SetIndex(0, 100);
  EXPECT_EQ(selection->GetIndex(0), 100);
  selection->SetIndex(1, 200);
  EXPECT_EQ(selection->GetIndex(1), 200);
  selection->SetNumSlots(2);
  EXPECT_EQ(selection->GetNumSlots(), 2);
  // TopArray() should return an array with 100,200
  auto array_raw = selection->ToArray();
  const auto& array = dynamic_cast<const arrow::UInt16Array&>(*array_raw);
  EXPECT_EQ(array.length(), 2) << array_raw->ToString();
  EXPECT_EQ(array.Value(0), 100) << array_raw->ToString();
  EXPECT_EQ(array.Value(1), 200) << array_raw->ToString();
}

A selection vector can also be populated from a bitmap.

TEST_F(TestSelectionVector, TestInt64PopulateFromBitMap) {
  int max_slots = 200;
  std::shared_ptr<SelectionVector> selection;
  auto status = SelectionVector::MakeInt64(max_slots, pool_, &selection);
  EXPECT_EQ(status.ok(), true) << status.message();
  int bitmap_size = RoundUpNumi64(max_slots) * 8;
  std::vector<uint8_t> bitmap(bitmap_size);
  arrow::bit_util::SetBit(&bitmap[0], 0);
  arrow::bit_util::SetBit(&bitmap[0], 5);
  arrow::bit_util::SetBit(&bitmap[0], 121);
  arrow::bit_util::SetBit(&bitmap[0], 220);
  status = selection->PopulateFromBitMap(&bitmap[0], bitmap_size, max_slots - 1);
  EXPECT_EQ(status.ok(), true) << status.message();
  EXPECT_EQ(selection->GetNumSlots(), 3);
  EXPECT_EQ(selection->GetIndex(0), 0);
  EXPECT_EQ(selection->GetIndex(1), 5);
  EXPECT_EQ(selection->GetIndex(2), 121);
}
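The conversion from a bitmap to an index-based selection vector is easy to sketch. The helpers below (SetBit and IndicesFromBitmap) are hypothetical stand-ins, not Gandiva's PopulateFromBitMap; they only assume Arrow's convention that bit i lives in byte i / 8 at position i % 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sets bit `i` using Arrow's least-significant-bit-first numbering.
inline void SetBit(std::vector<uint8_t>& bitmap, int64_t i) {
  assert(i >= 0 && static_cast<size_t>(i / 8) < bitmap.size());
  bitmap[static_cast<size_t>(i / 8)] |= static_cast<uint8_t>(1u << (i % 8));
}

// Collects the positions of all set bits that are <= max_bitmap_index.
inline std::vector<int64_t> IndicesFromBitmap(const std::vector<uint8_t>& bitmap,
                                              int64_t max_bitmap_index) {
  std::vector<int64_t> indices;
  const int64_t num_bits = static_cast<int64_t>(bitmap.size()) * 8;
  for (int64_t i = 0; i < num_bits && i <= max_bitmap_index; ++i) {
    if ((bitmap[static_cast<size_t>(i / 8)] >> (i % 8)) & 1) {
      indices.push_back(i);
    }
  }
  return indices;
}
```

This also explains why the test expects three slots rather than four: bit 220 is set, but it lies beyond the maximum index of 199 passed to PopulateFromBitMap, so only 0, 5 and 121 are selected.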

regex_functions/util

Regex-related helpers. They appear to deal with SQL pattern symbols, are built on Google's re2 library, and follow PCRE (Perl Compatible Regular Expressions) conventions.

const std::set<char> RegexUtil::pcre_regex_specials_ = {
    '[', ']', '(', ')', '|', '^', '-', '+', '*', '?', '{', '}', '$', '\\', '.'};
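A set of regex special characters like this is what one needs when turning a SQL LIKE pattern into a regular expression: % and _ become wildcards, and everything in the set has to be escaped. The sketch below is a simplified illustration under that assumption. LikeToRegex and LikeMatches are hypothetical names, std::regex stands in for re2, and LIKE escape characters are not handled.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>

// Translates a SQL LIKE pattern into a regular expression string.
inline std::string LikeToRegex(const std::string& like) {
  static const std::set<char> specials = {'[', ']', '(', ')', '|', '^', '-', '+',
                                          '*', '?', '{', '}', '$', '\\', '.'};
  std::string out;
  for (char c : like) {
    if (c == '%') {
      out += ".*";  // % matches any sequence of characters
    } else if (c == '_') {
      out += '.';   // _ matches exactly one character
    } else {
      if (specials.count(c) > 0) out += '\\';  // escape regex metacharacters
      out += c;
    }
  }
  return out;
}

// Full-string match of `text` against the LIKE pattern.
inline bool LikeMatches(const std::string& text, const std::string& like) {
  return std::regex_match(text, std::regex(LikeToRegex(like)));
}
```

Without the escaping step, a pattern such as %.txt would also match "helloXtxt", because the unescaped dot would act as a wildcard.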

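The tests mostly revolve around simple strings.

There is even a test with Chinese characters, which is a rare sight; UTF-8 handling in C++ has always puzzled me 😂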

  input_string = "路%c$大";
  extract_index = 2;  // Retrieve all matched string
  ret = extract_numbers(&execution_context_, input_string.c_str(),
                        static_cast<int32_t>(input_string.length()), extract_index,
                        &out_length);
  ret_as_str = std::string(ret, out_length);
  EXPECT_EQ(out_length, 1);
  EXPECT_EQ(ret_as_str, "c");

random_generator

A random number generator; the holder also takes care of the seed handling.

namespace gandiva {
/// Function Holder for 'random'
class GANDIVA_EXPORT RandomGeneratorHolder : public FunctionHolder {
 public:
  ~RandomGeneratorHolder() override = default;
  static Result<std::shared_ptr<RandomGeneratorHolder>> Make(const FunctionNode& node);
  double operator()() { return distribution_(generator_); }
 private:
  explicit RandomGeneratorHolder(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }
  RandomGeneratorHolder() : distribution_(0, 1) {
    generator_.seed(::arrow::internal::GetRandomSeed());
  }
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
}  // namespace gandiva
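The seed handling is easier to see in isolation. The XOR constant and 48-bit mask look like the seed scrambling used by java.util.Random, presumably so that a given integer seed is treated in a familiar way. Below is a standalone sketch with hypothetical names (ScrambleSeed, SeededUniform) that keeps only the seeded constructor from the excerpt.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Same scrambling as in the excerpt: XOR with a constant, keep the low 48 bits.
inline int64_t ScrambleSeed(int seed) {
  const int64_t seed64 = static_cast<int64_t>(seed);
  return (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
}

class SeededUniform {
 public:
  explicit SeededUniform(int seed) : distribution_(0, 1) {
    generator_.seed(static_cast<uint64_t>(ScrambleSeed(seed)));
  }

  // Draws the next value, uniformly distributed in [0, 1).
  double operator()() { return distribution_(generator_); }

 private:
  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
```

Two generators built from the same seed produce the same sequence, which is what makes random(seed) reproducible within one build. The exact values are not portable across standard library implementations, because std::uniform_real_distribution does not specify its algorithm.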

project

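This is the code that implements Project (projection) over Apache Arrow data in Gandiva.

/// \brief projection using expressions.
///
/// A projector is built for a specific schema and vector of expressions.
/// Once the projector is built, it can be used to evaluate many row batches.

The members show an LLVMGenerator, the schema, the output fields, the configuration that controls code generation, and a flag recording whether the projector was built from the cache.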

  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  FieldVector output_fields_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;
};

It also contains the code that allocates the output data buffers.

Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records,
                                 arrow::MemoryPool* pool,
                                 ArrayDataPtr* array_data) const {
  arrow::Status astatus;
  std::vector<std::shared_ptr<arrow::Buffer>> buffers;
  // The output vector always has a null bitmap.
  int64_t size = arrow::bit_util::BytesForBits(num_records);
  ARROW_ASSIGN_OR_RAISE(auto bitmap_buffer, arrow::AllocateBuffer(size, pool));
  buffers.push_back(std::move(bitmap_buffer));
  // String/Binary vectors have an offsets array.
  auto type_id = type->id();
  if (arrow::is_binary_like(type_id)) {
    auto offsets_len = arrow::bit_util::BytesForBits((num_records + 1) * 32);
    ARROW_ASSIGN_OR_RAISE(auto offsets_buffer, arrow::AllocateBuffer(offsets_len, pool));
    buffers.push_back(std::move(offsets_buffer));
  }
  // The output vector always has a data array.
  int64_t data_len;
  if (arrow::is_primitive(type_id) || type_id == arrow::Type::DECIMAL) {
    const auto& fw_type = static_cast<const arrow::FixedWidthType&>(*type);
    data_len = arrow::bit_util::BytesForBits(num_records * fw_type.bit_width());
  } else if (arrow::is_binary_like(type_id)) {
    // we don't know the expected size for varlen output vectors.
    data_len = 0;
  } else {
    return Status::Invalid("Unsupported output data type " + type->ToString());
  }
  ARROW_ASSIGN_OR_RAISE(auto data_buffer, arrow::AllocateResizableBuffer(data_len, pool));
  // This is not strictly required but valgrind gets confused and detects this
  // as uninitialized memory access. See arrow::util::SetBitTo().
  if (type->id() == arrow::Type::BOOL) {
    memset(data_buffer->mutable_data(), 0, data_len);
  }
  buffers.push_back(std::move(data_buffer));
  *array_data = arrow::ArrayData::Make(type, num_records, std::move(buffers));
  return Status::OK();
}
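The size arithmetic in AllocArrayData can be written out on its own. The following sketch recomputes the three buffer sizes with plain integers; BufferSizes, FixedWidthSizes and VarLenSizes are hypothetical names, and any padding applied by the allocator is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes needed to hold `bits` bits, rounded up.
inline int64_t BytesForBits(int64_t bits) {
  assert(bits >= 0);
  return (bits + 7) / 8;
}

struct BufferSizes {
  int64_t validity;  // null bitmap: one bit per record
  int64_t offsets;   // only used by string/binary columns
  int64_t data;      // value buffer
};

// Fixed-width column (e.g. int32 has bit_width 32, bool has bit_width 1).
inline BufferSizes FixedWidthSizes(int64_t num_records, int bit_width) {
  return {BytesForBits(num_records), 0, BytesForBits(num_records * bit_width)};
}

// String/binary column: (num_records + 1) 32-bit offsets, data size unknown up front.
inline BufferSizes VarLenSizes(int64_t num_records) {
  return {BytesForBits(num_records), BytesForBits((num_records + 1) * 32), 0};
}
```

For 10 int32 records this gives a 2-byte bitmap and a 40-byte data buffer; for 10 strings it gives a 2-byte bitmap, 44 bytes of offsets and an initially empty data buffer, which is why the code above allocates a resizable buffer.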

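Oddly, there is no test file sitting next to the projector source at the top level; judging from the listing above, its tests live under tests/ (projector_test.cc and projector_build_validation_test.cc).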

lru_cache

An LRU cache adapted from Boost. Because it is a template, the header alone does not reveal what gets stored in it.

// modified from boost LRU cache -> the boost cache supported only an
// ordered map.
namespace gandiva {
// a cache which evicts the least recently used item when it is full
template <class Key, class Value>
class LruCache {
 public:
  using key_type = Key;
  using value_type = Value;
  using list_type = std::list<key_type>;

The test code simply stores strings.

TEST_F(TestLruCache, TestLruBehavior) {
  cache_.insert(TestCacheKey(1), "hello");
  cache_.insert(TestCacheKey(2), "hello");
  cache_.get(TestCacheKey(1));
  cache_.insert(TestCacheKey(3), "hello");
  // should have evicted key 2.
  ASSERT_EQ(*cache_.get(TestCacheKey(1)), "hello");
}
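The eviction behaviour in this test is the standard list-plus-hash-map LRU design. Here is a minimal sketch of that design with a hypothetical TinyLru class. It is not Gandiva's LruCache, and it assumes the test fixture uses a capacity of 2.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

template <class Key, class Value>
class TinyLru {
 public:
  explicit TinyLru(size_t capacity) : capacity_(capacity) { assert(capacity > 0); }

  void Insert(const Key& key, const Value& value) {
    auto it = map_.find(key);
    if (it != map_.end()) {
      it->second.first = value;
      Touch(it);
      return;
    }
    if (map_.size() >= capacity_) {
      // The back of the list is the least recently used key.
      map_.erase(order_.back());
      order_.pop_back();
    }
    order_.push_front(key);
    map_.emplace(key, std::make_pair(value, order_.begin()));
  }

  // Returns the value and marks the key as most recently used.
  std::optional<Value> Get(const Key& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    Touch(it);
    return it->second.first;
  }

  bool Contains(const Key& key) const { return map_.count(key) > 0; }

 private:
  using ListIt = typename std::list<Key>::iterator;
  using Map = std::unordered_map<Key, std::pair<Value, ListIt>>;

  // Moves the key to the front of the recency list in O(1).
  void Touch(typename Map::iterator it) {
    order_.splice(order_.begin(), order_, it->second.second);
  }

  size_t capacity_;
  std::list<Key> order_;
  Map map_;
};
```

Replaying the test sequence (insert 1, insert 2, get 1, insert 3) evicts key 2, because the get moved key 1 to the front of the list just before the third insert.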

llvm_types

llvm_types manages LLVM type creation in one place and is used to map Arrow types to LLVM types. Similar code can be found in NoisePage.

class GANDIVA_EXPORT LLVMTypes {
 public:
  explicit LLVMTypes(llvm::LLVMContext& context);
  llvm::Type* void_type() { return llvm::Type::getVoidTy(context_); }
  llvm::Type* i1_type() { return llvm::Type::getInt1Ty(context_); }
  llvm::Type* i8_type() { return llvm::Type::getInt8Ty(context_); }
  llvm::Type* i16_type() { return llvm::Type::getInt16Ty(context_); }
  llvm::Type* i32_type() { return llvm::Type::getInt32Ty(context_); }
  llvm::Type* i64_type() { return llvm::Type::getInt64Ty(context_); }
  llvm::Type* i128_type() { return llvm::Type::getInt128Ty(context_); }
  llvm::StructType* i128_split_type() {
    // struct with high/low bits (see decimal_ops.cc:DecimalSplit)
    return llvm::StructType::get(context_, {i64_type(), i64_type()}, false);
  }

Plus a few simple constant helpers.

  llvm::Constant* i128_zero() { return i128_constant(0); }
  llvm::Constant* i128_one() { return i128_constant(1); }

llvm_includes

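The MSVC warning suppression at the top is worth noting. It was the first time I had come across it, and it shows that Gandiva is meant to build on Windows.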

#if defined(_MSC_VER)
#  pragma warning(push)
#  pragma warning(disable : 4141)
#  pragma warning(disable : 4146)
#  pragma warning(disable : 4244)
#  pragma warning(disable : 4267)
#  pragma warning(disable : 4291)
#  pragma warning(disable : 4624)
#endif

Differences between LLVM versions are handled as well.

#if LLVM_VERSION_MAJOR >= 10
#  define LLVM_ALIGN(alignment) (llvm::Align((alignment)))
#else
#  define LLVM_ALIGN(alignment) (alignment)
#endif

llvm_generator

The core of the project: LLVM code generation.

The generator appears to make good use of caching.

class GANDIVA_EXPORT LLVMGenerator {
 public:
  /// \brief Factory method to initialize the generator.
  static Result<std::unique_ptr<LLVMGenerator>> Make(
      const std::shared_ptr<Configuration>& config, bool cached,
      std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache =
          std::nullopt);
  /// \brief Get the cache to be used for LLVM ObjectCache.
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
  GetCache();

It stores the SelectionVector::Mode in use.

SelectionVector::Mode selection_vector_mode() { return selection_vector_mode_; }

Build takes the expressions and generates code for them.

  /// \brief Build the code for the expression trees for default mode with a LLVM
  /// ObjectCache. Each element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs, SelectionVector::Mode mode);
  /// \brief Build the code for the expression trees for default mode. Each
  /// element in the vector represents an expression tree
  Status Build(const ExpressionVector& exprs);

Execute feeds Arrow data into the functions compiled from the generated LLVM IR.

  /// \brief Execute the built expression against the provided arguments for
  /// default mode.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const ArrayDataVector& output_vector) const;
  /// \brief Execute the built expression against the provided arguments for
  /// all modes. Only works on the records specified in the selection_vector.
  Status Execute(const arrow::RecordBatch& record_batch,
                 const SelectionVector* selection_vector,
                 const ArrayDataVector& output_vector) const;

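LLVMContext and IRBuilder are here, as expected. CreateGlobalStringPtr, however, simply forwards to the engine without checking for duplicates; I am not sure whether that is an oversight or whether the check happens somewhere else 😂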

  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }

Then a Visitor walks the decomposed expression (Dex) tree once more.

 class Visitor : public DexVisitor {
   public:
    Visitor(LLVMGenerator* generator, llvm::Function* function,
            llvm::BasicBlock* entry_block, llvm::Value* arg_addrs,
            llvm::Value* arg_local_bitmaps, llvm::Value* arg_holder_ptrs,
            std::vector<llvm::Value*> slice_offsets, llvm::Value* arg_context_ptr,
            llvm::Value* loop_var);
    void Visit(const VectorReadValidityDex& dex) override;
    void Visit(const VectorReadFixedLenValueDex& dex) override;
    void Visit(const VectorReadVarLenValueDex& dex) override;
    void Visit(const LocalBitMapValidityDex& dex) override;
    void Visit(const TrueDex& dex) override;
    void Visit(const FalseDex& dex) override;
    void Visit(const LiteralDex& dex) override;
    void Visit(const NonNullableFuncDex& dex) override;
    void Visit(const NullableNeverFuncDex& dex) override;
    void Visit(const NullableInternalFuncDex& dex) override;
    void Visit(const IfDex& dex) override;
    void Visit(const BooleanAndDex& dex) override;
    void Visit(const BooleanOrDex& dex) override;
    void Visit(const InExprDexBase<int32_t>& dex) override;
    void Visit(const InExprDexBase<int64_t>& dex) override;
    void Visit(const InExprDexBase<float>& dex) override;
    void Visit(const InExprDexBase<double>& dex) override;
    void Visit(const InExprDexBase<gandiva::DecimalScalar128>& dex) override;
    void Visit(const InExprDexBase<std::string>& dex) override;
    template <typename Type>
    void VisitInExpression(const InExprDexBase<Type>& dex);
    LValuePtr result() { return result_; }
    bool has_arena_allocs() { return has_arena_allocs_; }

There are also dedicated helpers for building function parameters, function calls and if-else blocks in LLVM.

    std::vector<llvm::Value*> BuildParams(int holder_idx,
                                          const ValueValidityPairVector& args,
                                          bool with_validity, bool with_context);
    // Generate code to invoke a function call.
    LValuePtr BuildFunctionCall(const NativeFunction* func, DataTypePtr arrow_return_type,
                                std::vector<llvm::Value*>* params);
    // Generate code for an if-else condition.
    LValuePtr BuildIfElse(llvm::Value* condition, std::function<LValuePtr()> then_func,
                          std::function<LValuePtr()> else_func,
                          DataTypePtr arrow_return_type);

Calls to pre-compiled LLVM IR functions are added through this interface.

  /// Generate code to make a function call (to a pre-compiled IR function) which takes
  /// 'args' and has a return type 'ret_type'.
  llvm::Value* AddFunctionCall(const std::string& full_name, llvm::Type* ret_type,
                               const std::vector<llvm::Value*>& args);

The cache implementation in detail.

std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
LLVMGenerator::GetCache() {
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
      shared_cache = std::make_shared<
          Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>();
  return shared_cache;
}
Status LLVMGenerator::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  return engine_->SetLLVMObjectCache(object_cache);
}

Part of the Build implementation.

Status LLVMGenerator::Build(const ExpressionVector& exprs, SelectionVector::Mode mode) {
  selection_vector_mode_ = mode;
  for (auto& expr : exprs) {
    auto output = annotator_.AddOutputFieldDescriptor(expr->result());
    ARROW_RETURN_NOT_OK(Add(expr, output));
  }
  // Compile and inject into the process' memory the generated function.
  ARROW_RETURN_NOT_OK(engine_->FinalizeModule());
  // setup the jit functions for each expression.
  for (auto& compiled_expr : compiled_exprs_) {
    auto fn_name = compiled_expr->GetFunctionName(mode);
    ARROW_ASSIGN_OR_RAISE(auto fn_ptr, engine_->CompiledFunction(fn_name));
    auto jit_fn = reinterpret_cast<EvalFunc>(fn_ptr);
    compiled_expr->SetJITFunction(selection_vector_mode_, jit_fn);
  }
  return Status::OK();
}

This part deserves a closer read when time permits. As for tests, the sample here is a vector add that LLVM auto-vectorizes.

TEST_F(TestLLVMGenerator, TestAdd) {
  // Setup LLVM generator to do an arithmetic add of two vectors
  ASSERT_OK_AND_ASSIGN(auto generator,
                       LLVMGenerator::Make(TestConfigWithIrDumping(), false));
  Annotator annotator;
  auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
  auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
  auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);
  auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0);
  auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
  auto field1 = std::make_shared<arrow::Field>("f1", arrow::int32());
  auto desc1 = annotator.CheckAndAddInputFieldDescriptor(field1);
  auto validity_dex1 = std::make_shared<VectorReadValidityDex>(desc1);
  auto value_dex1 = std::make_shared<VectorReadFixedLenValueDex>(desc1);
  auto pair1 = std::make_shared<ValueValidityPair>(validity_dex1, value_dex1);
  DataTypeVector params{arrow::int32(), arrow::int32()};
  auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
  FunctionSignature signature(func_desc->name(), func_desc->params(),
                              func_desc->return_type());
  const NativeFunction* native_func =
      generator->function_registry_->LookupSignature(signature);
  std::vector<ValueValidityPairPtr> pairs{pair0, pair1};
  auto func_dex = std::make_shared<NonNullableFuncDex>(
      func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
  auto field_sum = std::make_shared<arrow::Field>("out", arrow::int32());
  auto desc_sum = annotator.CheckAndAddInputFieldDescriptor(field_sum);
  // LLVM 10 doesn't like the expr function name to be the same as the module name when
  // LLJIT is used
  std::string fn_name = "llvm_gen_test_add_expr";
  ASSERT_OK(generator->engine_->LoadFunctionIRs());
  ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                        SelectionVector::MODE_NONE));
  ASSERT_OK(generator->engine_->FinalizeModule());
  auto const& ir = generator->engine_->ir();
  EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
  ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
  ASSERT_TRUE(fn_ptr);
  auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
  constexpr size_t kNumRecords = 4;
  std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};
  std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};
  uint64_t in_bitmap = 0xffffffffffffffffull;
  std::array<uint32_t, kNumRecords> out{0, 0, 0, 0};
  uint64_t out_bitmap = 0;
  std::array<uint8_t*, 6> addrs{
      reinterpret_cast<uint8_t*>(a0.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(a1.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap),
  };
  std::array<int64_t, 6> addr_offsets{0, 0, 0, 0, 0, 0};
  eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
            0 /* dummy context ptr */, kNumRecords);
  EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));
  EXPECT_EQ(out_bitmap, 0ULL);
}
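To see what the JIT-compiled function does with the addrs array, it helps to write the same computation by hand. The sketch below is a plain C++ stand-in for the generated code. AddInt32Columns and RunAddExample are hypothetical names, and only the first and last parameters of the real EvalFunc are kept.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hand-written equivalent of the generated "add" kernel.
// Buffer layout, as in the test:
//   {in0 data, in0 validity, in1 data, in1 validity, out data, out validity}
inline void AddInt32Columns(uint8_t** addrs, int64_t num_records) {
  const auto* in0 = reinterpret_cast<const int32_t*>(addrs[0]);
  const auto* in1 = reinterpret_cast<const int32_t*>(addrs[2]);
  auto* out = reinterpret_cast<int32_t*>(addrs[4]);
  // A simple counted loop with no branches: the kind of loop an optimizer
  // can turn into SIMD instructions.
  for (int64_t i = 0; i < num_records; ++i) {
    out[i] = in0[i] + in1[i];
  }
}

inline std::array<int32_t, 4> RunAddExample() {
  std::array<int32_t, 4> a0{1, 2, 3, 4};
  std::array<int32_t, 4> a1{5, 6, 7, 8};
  std::array<int32_t, 4> out{0, 0, 0, 0};
  uint64_t in_bitmap = 0xffffffffffffffffull;
  uint64_t out_bitmap = 0;
  std::array<uint8_t*, 6> addrs{
      reinterpret_cast<uint8_t*>(a0.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(a1.data()),  reinterpret_cast<uint8_t*>(&in_bitmap),
      reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap),
  };
  AddInt32Columns(addrs.data(), 4);
  assert(out_bitmap == 0);  // this value-only kernel never writes the validity buffer
  return out;
}
```

The value kernel touches only the data buffers, so out_bitmap stays 0 in this sketch, which is consistent with the final assertion in the real test.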

GPT's walkthrough of this test is worth reading:

The TestAdd test case exercises the full flow of the LLVM code generator producing a vector addition. In detail:

Test goal

Verify that Gandiva can correctly generate and execute LLVM code that adds two int32 vectors.

Test flow

1. Initialize the LLVM generator
ASSERT_OK_AND_ASSIGN(auto generator, LLVMGenerator::Make(TestConfigWithIrDumping(), false));
Creates an LLVM code generator with IR dumping enabled.

2. Set up the input vector descriptors
// First input vector f0 (int32)
auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);  // validity bitmap
auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0); // data values
auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
// Second input vector f1 (int32): set up the same way

3. Define the add function
DataTypeVector params{arrow::int32(), arrow::int32()};
auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
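Defines an add function that takes two int32 parameters and returns int32.

4. Create the function expression
auto func_dex = std::make_shared<NonNullableFuncDex>(
    func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
Binds the input vector pairs to the add function.

5. Generate the LLVM code
ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                      SelectionVector::MODE_NONE));
Generates the LLVM IR for the expression. As far as I can tell, the literal 4 passed here is a buffer count rather than a record count; the number of records (kNumRecords) is only supplied when the compiled function is called.

6. Compile and get the function pointer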
ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
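Compiles the LLVM IR to machine code and returns an executable function pointer.

7. Prepare the test data
std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};  // first vector
std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};  // second vector
uint64_t in_bitmap = 0xffffffffffffffffull;        // all inputs valid
std::array<uint32_t, kNumRecords> out{0, 0, 0, 0}; // output vector

8. Execute the generated function
eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr,
          0 /* dummy context ptr */, kNumRecords);
Calls the compiled function to perform the vector add.

9. Verify the result
EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));  // 1+5, 2+6, 3+7, 4+8
EXPECT_EQ(out_bitmap, 0ULL);  // the output bitmap is expected to stay 0
Note that in Arrow a set bit marks a valid slot, so a bitmap of 0 does not mean "all valid". The bitmap stays 0 here presumably because this test only generates and runs the value computation and never writes the validity buffer.

Key technical points

1. Verifying vectorization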
EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
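Checks that the generated LLVM IR contains a vectorized loop body (the vector.body label), which confirms that LLVM vectorized the loop.

2. Memory layout
std::array<uint8_t*, 6> addrs{
    reinterpret_cast<uint8_t*>(a0.data()),  // data of the first vector
    reinterpret_cast<uint8_t*>(&in_bitmap), // validity bitmap of the first vector
    reinterpret_cast<uint8_t*>(a1.data()),  // data of the second vector
    reinterpret_cast<uint8_t*>(&in_bitmap), // validity bitmap of the second vector
    reinterpret_cast<uint8_t*>(out.data()), // data of the output vector
    reinterpret_cast<uint8_t*>(&out_bitmap) // validity bitmap of the output vector
};

What the test demonstrates

The test exercises Gandiva's core functionality:

Expression representation: the high-level expression is turned into the internal form

Code generation: efficient LLVM IR is generated

Compilation and execution: the IR is compiled to machine code and run

Vectorization: the generated code makes use of SIMD instructions

Correctness: the computed result is checked

It is an end-to-end integration test that makes sure the whole code generation and execution pipeline works.

testing::HasSubstr used here is a GMock matcher.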

This shows that C functions can be registered directly.

TEST_F(TestLLVMGenerator, VerifyExtendedCFunctions) {
  VerifyFunctionMapping("multiply_by_three_int32", [](auto registry) {
    return TestConfigWithCFunction(std::move(registry));
  });
}

// test_util.cc
std::shared_ptr<Configuration> TestConfigWithCFunction(
    std::shared_ptr<FunctionRegistry> registry) {
  return BuildConfigurationWithRegistry(std::move(registry), [](auto reg) {
    return reg->Register(GetTestExternalCFunction(),
                         reinterpret_cast<void*>(multiply_by_three));
  });
}
static int64_t multiply_by_three(int32_t value) { return value * 3; }

literal_holder

The uniform representation for constant values of every supported type in Gandiva.

namespace gandiva {
using LiteralHolder =
    std::variant<bool, float, double, int8_t, int16_t, int32_t, int64_t, uint8_t,
                 uint16_t, uint32_t, uint64_t, std::string, DecimalScalar128>;
GANDIVA_EXPORT std::string ToString(const LiteralHolder& holder);
}  // namespace gandiva

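std::variant, introduced in C++17, is a type-safe union: at run time it holds a value of exactly one of a fixed set of alternative types, without the pitfalls of a traditional union.

Rust's enum type is a more powerful take on the same idea.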
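For readers who have not used std::variant, here is a short sketch of how a ToString-style function over such a holder can be written with std::visit. TinyLiteral and Describe are hypothetical names with far fewer alternatives than LiteralHolder, and the output format is my own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

using TinyLiteral = std::variant<bool, int32_t, double, std::string>;

// Dispatches on whichever alternative is currently stored in the variant.
inline std::string Describe(const TinyLiteral& holder) {
  return std::visit(
      [](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, bool>) {
          return v ? "bool:true" : "bool:false";
        } else if constexpr (std::is_same_v<T, std::string>) {
          return "string:" + v;
        } else if constexpr (std::is_same_v<T, int32_t>) {
          return "int32:" + std::to_string(v);
        } else {
          return "double:" + std::to_string(v);
        }
      },
      holder);
}
```

The generic lambda is instantiated once per alternative, so every stored type has a matching branch at compile time. This is roughly the role that match plays for a Rust enum. When constructing the variant, passing an explicit std::string rather than a string literal avoids any surprise about which alternative gets selected.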

interval_holder

Handles the various kinds of time intervals.

  // Pass only years and days to cast
  data = "P12Y15D";
  response = cast_interval_day(&execution_context_, data.data(), 7, true, &out_valid);
  qty_days_in_response = 15;
  qty_millis_in_response = 0;
  EXPECT_TRUE(out_valid);
  EXPECT_FALSE(execution_context_.has_error());
  EXPECT_EQ(response, (qty_millis_in_response << 32) | qty_days_in_response);
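The last assertion shows how a day-time interval is packed into a single 64-bit value: milliseconds in the upper 32 bits and days in the lower 32 bits. A small sketch of that packing, with hypothetical helper names and non-negative inputs assumed:

```cpp
#include <cassert>
#include <cstdint>

// Packs a day-time interval: millis in the high 32 bits, days in the low 32 bits.
inline int64_t PackDayInterval(int32_t days, int32_t millis) {
  assert(days >= 0 && millis >= 0);  // the sketch does not handle negative parts
  return (static_cast<int64_t>(millis) << 32) | static_cast<int64_t>(days);
}

inline int32_t DaysOf(int64_t packed) {
  return static_cast<int32_t>(packed & 0xFFFFFFFF);
}

inline int32_t MillisOf(int64_t packed) {
  return static_cast<int32_t>(packed >> 32);
}
```

For "P12Y15D" the test expects 15 days and 0 milliseconds, so the packed value is simply 15; the year component does not show up in the day interval.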

hash_utils

The hash functions are built on OpenSSL, mainly the SHA family and MD5.

GANDIVA_EXPORT
const char* gdv_sha512_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha512_result_length = 128;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha512(),
                                sha512_result_length, out_length);
}
/// Hashes a generic message using the SHA256 algorithm
GANDIVA_EXPORT
const char* gdv_sha256_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha256_result_length = 64;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha256(),
                                sha256_result_length, out_length);
}
/// Hashes a generic message using the SHA1 algorithm
GANDIVA_EXPORT
const char* gdv_sha1_hash(int64_t context, const void* message, size_t message_length,
                          int32_t* out_length) {
  constexpr int sha1_result_length = 40;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha1(),
                                sha1_result_length, out_length);
}
GANDIVA_EXPORT
const char* gdv_md5_hash(int64_t context, const void* message, size_t message_length,
                         int32_t* out_length) {
  constexpr int md5_result_length = 32;
  return gdv_hash_using_openssl(context, message, message_length, EVP_md5(),
                                md5_result_length, out_length);
}
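The result lengths 128, 64, 40 and 32 are twice the digest sizes in bytes, which suggests the hashes are returned as hexadecimal strings (two characters per byte). The sketch below shows that encoding without OpenSSL. ToHex and HexLength are hypothetical helpers, and the use of lowercase digits is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Length of the hex string for a digest of `digest_bits` bits.
inline size_t HexLength(size_t digest_bits) {
  assert(digest_bits % 8 == 0);
  return digest_bits / 8 * 2;
}

// Encodes raw digest bytes as a hex string, two characters per byte.
inline std::string ToHex(const std::vector<uint8_t>& digest) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(digest.size() * 2);
  for (uint8_t byte : digest) {
    out.push_back(kDigits[byte >> 4]);    // high nibble
    out.push_back(kDigits[byte & 0x0F]);  // low nibble
  }
  return out;
}
```

SHA-512 produces 512 bits, so HexLength(512) is 128; likewise SHA-256 gives 64, SHA-1 gives 40 and MD5 gives 32.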

gandiva_object_cache

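Caches what is compiled for expressions such as result1 = evaluate("column1 + column2 * 3");, that is, the generated object code rather than the evaluated data. The class derives from llvm::ObjectCache and keeps the compiled code in llvm::MemoryBuffer instances.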

class GandivaObjectCache : public llvm::ObjectCache {
 public:
  explicit GandivaObjectCache(
      std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>&
          cache,
      ExpressionCacheKey key);
  ~GandivaObjectCache() {}
  void notifyObjectCompiled(const llvm::Module* M, llvm::MemoryBufferRef Obj);
  std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module* M);
 private:
  ExpressionCacheKey cache_key_;
  std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> cache_;
};

function_signature

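Gives each function signature a hash. My guess is that it is used for cached lookups, which would fit the SignatureMap member of FunctionRegistry shown in the next section. The test also shows that function names are treated case-insensitively (extractDay and extractday hash to the same value).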

  EXPECT_EQ(FunctionSignature("extract_month", {arrow::date32()}, arrow::int64()),
            FunctionSignature("extract_month", {local_date32_type_}, local_i64_type_));
TEST_F(TestFunctionSignature, TestHash) {
  FunctionSignature f1("add", {arrow::int32(), arrow::int32()}, arrow::int64());
  FunctionSignature f2("add", {local_i32_type_, local_i32_type_}, local_i64_type_);
  EXPECT_EQ(f1.Hash(), f2.Hash());
  FunctionSignature f3("extractDay", {arrow::int64()}, arrow::int64());
  FunctionSignature f4("extractday", {arrow::int64()}, arrow::int64());
  EXPECT_EQ(f3.Hash(), f4.Hash());
}
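A hash with these properties (case-insensitive name, types compared by content) can be sketched as follows. TinySignature, HashSignature and SameSignature are hypothetical; types are represented as plain strings here, whereas Gandiva works with Arrow DataType objects, and the mixing step is a boost-style hash combine chosen for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct TinySignature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

inline std::string Lower(std::string s) {
  for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Hash that ignores the case of the function name.
inline size_t HashSignature(const TinySignature& sig) {
  auto combine = [](size_t seed, size_t h) {
    return seed ^ (h + 0x9e3779b9 + (seed << 6) + (seed >> 2));
  };
  size_t seed = std::hash<std::string>{}(Lower(sig.name));
  for (const auto& p : sig.param_types) {
    seed = combine(seed, std::hash<std::string>{}(p));
  }
  return combine(seed, std::hash<std::string>{}(sig.return_type));
}

// Equality must agree with the hash for use as an unordered_map key.
inline bool SameSignature(const TinySignature& a, const TinySignature& b) {
  return Lower(a.name) == Lower(b.name) && a.param_types == b.param_types &&
         a.return_type == b.return_type;
}
```

With a matching pair of hash and equality functions, a signature can serve as the key of an unordered map, so looking up a function by name and argument types is a single hash probe.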

function_registry

class GANDIVA_EXPORT FunctionRegistry {
 public:
  using iterator = const NativeFunction*;
  using FunctionHolderMaker =
      std::function<arrow::Result<std::shared_ptr<FunctionHolder>>(
          const FunctionNode& function_node)>;
  FunctionRegistry();
  FunctionRegistry(const FunctionRegistry&) = delete;
  FunctionRegistry& operator=(const FunctionRegistry&) = delete;
  /// Lookup a pre-compiled function by its signature.
  const NativeFunction* LookupSignature(const FunctionSignature& signature) const;
  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);
  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);
  /// \brief register a C function into the function registry
  /// @param func the registered function's metadata
  /// @param c_function_ptr the function pointer to the
  /// registered function's implementation
  /// @param function_holder_maker this will be used as the function holder if the
  /// function requires a function holder
  arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);
  /// \brief get a list of bitcode memory buffers saved in the registry
  const std::vector<std::shared_ptr<arrow::Buffer>>& GetBitcodeBuffers() const;
  /// \brief get a list of C functions saved in the registry
  const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;
  const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;
  iterator begin() const;
  iterator end() const;
  iterator back() const;
  friend arrow::Result<std::shared_ptr<FunctionRegistry>> MakeDefaultFunctionRegistry();
 private:
  std::vector<NativeFunction> pc_registry_;
  SignatureMap pc_registry_map_;
  std::vector<std::shared_ptr<arrow::Buffer>> bitcode_memory_buffers_;
  std::vector<std::pair<NativeFunction, void*>> c_functions_;
  FunctionHolderMakerRegistry holder_maker_registry_;
  Status Add(NativeFunction func);
};
/// \brief get the default function registry
GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();
}  // namespace gandiva

function_ir_builder

A very general-purpose IR builder (how did I never think of this before?), which can even build if-else blocks with the branching between basic blocks.

class FunctionIRBuilder {
 public:
  explicit FunctionIRBuilder(Engine* engine) : engine_(engine) {}
  virtual ~FunctionIRBuilder() = default;
 protected:
  LLVMTypes* types() { return engine_->types(); }
  llvm::Module* module() { return engine_->module(); }
  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }
  /// Build an if-else block.
  llvm::Value* BuildIfElse(llvm::Value* condition, llvm::Type* return_type,
                           std::function<llvm::Value*()> then_func,
                           std::function<llvm::Value*()> else_func);
  struct NamedArg {
    std::string name;
    llvm::Type* type;
  };
  /// Build llvm fn.
  llvm::Function* BuildFunction(const std::string& function_name, llvm::Type* return_type,
                                std::vector<NamedArg> in_args);
 private:
  Engine* engine_;
};
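
To make the shape of BuildIfElse more concrete, here is a toy sketch in plain C++. It is not Gandiva's implementation and it does not use LLVM: ToyIRBuilder, Emit, StartBlock and the textual pseudo-IR are all made up for illustration. It only shows the pattern the signature above suggests: the caller passes two thunks, the builder creates the then/else/merge blocks, runs each thunk while its block is current, and joins the two results with a phi.

```cpp
#include <functional>
#include <sstream>
#include <string>

// Toy stand-in for an IR builder. It appends textual pseudo-IR to a string;
// nothing here is real LLVM API.
class ToyIRBuilder {
 public:
  // Append one instruction to the block currently being filled.
  void Emit(const std::string& line) { out_ << "  " << line << "\n"; }

  // Emit a conditional branch, both arms and a merge block. Each thunk emits
  // the code for its arm and returns the name of the value that arm produces.
  std::string BuildIfElse(const std::string& condition,
                          const std::function<std::string()>& then_func,
                          const std::function<std::string()>& else_func) {
    const std::string id = std::to_string(next_id_++);
    const std::string then_bb = "then" + id;
    const std::string else_bb = "else" + id;
    const std::string merge_bb = "merge" + id;

    Emit("br " + condition + ", " + then_bb + ", " + else_bb);

    StartBlock(then_bb);
    const std::string then_value = then_func();
    // A nested if-else may have moved us into another block, so the phi must
    // name the block we are in now, not the block the arm started in.
    const std::string then_end = current_block_;
    Emit("br " + merge_bb);

    StartBlock(else_bb);
    const std::string else_value = else_func();
    const std::string else_end = current_block_;
    Emit("br " + merge_bb);

    StartBlock(merge_bb);
    const std::string result = "%ifelse" + id;
    Emit(result + " = phi [" + then_value + ", " + then_end + "], [" +
         else_value + ", " + else_end + "]");
    return result;
  }

  std::string str() const { return out_.str(); }

 private:
  // Open a new labelled block and make it the current one.
  void StartBlock(const std::string& name) {
    out_ << name << ":\n";
    current_block_ = name;
  }

  std::ostringstream out_;
  std::string current_block_ = "entry";
  int next_id_ = 0;
};
```

Taking the two arms as callbacks is what lets the builder control the order in which blocks are opened and closed, so the caller never has to manage insertion points by hand.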

filter

This part is also implemented on top of LLVM, and it looks much the same as Project.

 private:
  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;
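
The main difference from Project is the output: instead of new columns, a filter yields a selection vector, the indices of the rows that satisfy the condition. The sketch below is a hand-written scalar version of that idea in plain C++, not Gandiva code. FilterGreaterThan is a hypothetical name, and treating null rows as "not selected" is my assumption about the semantics.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the indices of the rows where values[i] > threshold.
// Rows whose validity flag is false (nulls) are never selected.
std::vector<uint32_t> FilterGreaterThan(const std::vector<int32_t>& values,
                                        const std::vector<bool>& validity,
                                        int32_t threshold) {
  std::vector<uint32_t> selection;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (validity[i] && values[i] > threshold) {
      selection.push_back(static_cast<uint32_t>(i));
    }
  }
  return selection;
}
```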

To add a cache, just call SetLLVMObjectCache:

Status Engine::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  auto cached_buffer = object_cache.getObject(nullptr);
  if (cached_buffer) {
    auto error = lljit_->addObjectFile(std::move(cached_buffer));
    if (error) {
      return Status::CodeGenError("Failed to add cached object file to LLJIT: ",
                                  llvm::toString(std::move(error)));
    }
  }
  return Status::OK();
}

The optimization passes are hooked into LLVM's new PassManager through a pipeline-start callback:

static void OptimizeModuleWithNewPassManager(llvm::Module& module,
                                             llvm::TargetIRAnalysis target_analysis) {
  // Setup an optimiser pipeline
  llvm::PassBuilder pass_builder;
  llvm::LoopAnalysisManager loop_am;
  llvm::FunctionAnalysisManager function_am;
  llvm::CGSCCAnalysisManager cgscc_am;
  llvm::ModuleAnalysisManager module_am;
  function_am.registerPass([&] { return target_analysis; });
  // Register required analysis managers
  pass_builder.registerModuleAnalyses(module_am);
  pass_builder.registerCGSCCAnalyses(cgscc_am);
  pass_builder.registerFunctionAnalyses(function_am);
  pass_builder.registerLoopAnalyses(loop_am);
  pass_builder.crossRegisterProxies(loop_am, function_am, cgscc_am, module_am);
  pass_builder.registerPipelineStartEPCallback([&](llvm::ModulePassManager& module_pm,
                                                   llvm::OptimizationLevel Level) {
    module_pm.addPass(llvm::ModuleInlinerPass());
    llvm::FunctionPassManager function_pm;
    function_pm.addPass(llvm::InstCombinePass());
    function_pm.addPass(llvm::PromotePass());
    function_pm.addPass(llvm::GVNPass());
    function_pm.addPass(llvm::NewGVNPass());
    function_pm.addPass(llvm::SimplifyCFGPass());
    function_pm.addPass(llvm::LoopVectorizePass());
    function_pm.addPass(llvm::SLPVectorizerPass());
    module_pm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(function_pm)));
    module_pm.addPass(llvm::GlobalOptPass());
  });
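
For context, LoopVectorizePass and SLPVectorizerPass pay off on loops like the one below: a tight, branch-free pass over contiguous buffers. This is a hand-written illustration rather than code generated by Gandiva. AddInt32Columns is a hypothetical name, and whether a compiler actually vectorizes the loop depends on the target and the optimization flags.

```cpp
#include <cstdint>

// Element-wise add of two int32 columns stored as contiguous buffers.
// There is no branching in the loop body and each iteration is independent,
// which is the shape auto-vectorizers handle best.
void AddInt32Columns(const int32_t* left, const int32_t* right, int32_t* out,
                     int64_t num_rows) {
  for (int64_t i = 0; i < num_rows; ++i) {
    out[i] = left[i] + right[i];
  }
}
```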

engine

Almost all of the LLVM Engine configuration lives in engine.h, engine.cc and engine_llvm_test.cc. The engine can also load precompiled LLVM IR.

  /// load pre-compiled IR modules from precompiled_bitcode.cc and merge them into
  /// the main module.
  Status LoadPreCompiledIR();
  // load external pre-compiled bitcodes into module
  Status LoadExternalPreCompiledIR();
  // Create and add mappings for cpp functions that can be accessed from LLVM.
  arrow::Status AddGlobalMappings();
  // Remove unused functions to reduce compile time.
  Status RemoveUnusedFunctions();
  std::unique_ptr<llvm::LLVMContext> context_;
  std::unique_ptr<llvm::orc::LLJIT> lljit_;
  std::unique_ptr<llvm::IRBuilder<>> ir_builder_;
  std::unique_ptr<llvm::Module> module_;
  LLVMTypes types_;
  std::vector<std::string> functions_to_compile_;
  bool optimize_ = true;
  bool module_finalized_ = false;
  bool cached_;
  bool functions_loaded_ = false;
  std::shared_ptr<FunctionRegistry> function_registry_;
  std::string module_ir_;
  std::unique_ptr<llvm::TargetMachine> target_machine_;
  const std::shared_ptr<Configuration> conf_;
};

encrypt

Gandiva also ships encryption helpers (though I could not find any documentation on how to use them). The AES encryption it uses comes from OpenSSL.

GANDIVA_EXPORT
int32_t aes_encrypt(const char* plaintext, int32_t plaintext_len, const char* key,
                    int32_t key_len, unsigned char* cipher);
/**
 * Decrypt data using aes algorithm
 **/
GANDIVA_EXPORT
int32_t aes_decrypt(const char* ciphertext, int32_t ciphertext_len, const char* key,
                    int32_t key_len, unsigned char* plaintext);

The corresponding test:

TEST(TestShaEncryptUtils, TestAesEncryptDecrypt) {
  // 16 bytes key
  auto* key = "12345678abcdefgh";
  auto* to_encrypt = "some test string";
  auto key_len = static_cast<int32_t>(strlen(reinterpret_cast<const char*>(key)));
  auto to_encrypt_len =
      static_cast<int32_t>(strlen(reinterpret_cast<const char*>(to_encrypt)));
  unsigned char cipher_1[64];
  int32_t cipher_1_len =
      gandiva::aes_encrypt(to_encrypt, to_encrypt_len, key, key_len, cipher_1);
  unsigned char decrypted_1[64];
  int32_t decrypted_1_len = gandiva::aes_decrypt(reinterpret_cast<const char*>(cipher_1),
                                                 cipher_1_len, key, key_len, decrypted_1);
  EXPECT_EQ(std::string(reinterpret_cast<const char*>(to_encrypt), to_encrypt_len),
            std::string(reinterpret_cast<const char*>(decrypted_1), decrypted_1_len));
}

decimal_ir

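Code generation for decimals (128-bit values that carry a precision and a scale, not floating-point numbers) gets special treatment. It looks like there are quite a few pitfalls in here 😂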

class DecimalIR : public FunctionIRBuilder {
 public:
  explicit DecimalIR(Engine* engine)
      : FunctionIRBuilder(engine), enable_ir_traces_(false) {}
  /// Build decimal IR functions and add them to the engine.
  static Status AddFunctions(Engine* engine);
  void EnableTraces() { enable_ir_traces_ = true; }
  llvm::Value* CallDecimalFunction(const std::string& function_name,
                                   llvm::Type* return_type,
                                   const std::vector<llvm::Value*>& args);
 private:
  /// The intrinsic fn for divide with small divisors is about 10x slower, so not
  /// using these.
  static const bool kUseOverflowIntrinsics = false;
  // Holder for an i128 value, along with its with scale and precision.
  class ValueFull {
   public:
    ValueFull(llvm::Value* value, llvm::Value* precision, llvm::Value* scale)
        : value_(value), precision_(precision), scale_(scale) {}
    llvm::Value* value() const { return value_; }
    llvm::Value* precision() const { return precision_; }
    llvm::Value* scale() const { return scale_; }
   private:
    llvm::Value* value_;
    llvm::Value* precision_;
    llvm::Value* scale_;
  };
  // Holder for an i128 value, and a boolean indicating overflow.
  class ValueWithOverflow {
   public:
    ValueWithOverflow(llvm::Value* value, llvm::Value* overflow)
        : value_(value), overflow_(overflow) {}
    // Make from IR struct
    static ValueWithOverflow MakeFromStruct(DecimalIR* decimal_ir, llvm::Value* dstruct);
    // Build a corresponding IR struct
    llvm::Value* AsStruct(DecimalIR* decimal_ir) const;
    llvm::Value* value() const { return value_; }
    llvm::Value* overflow() const { return overflow_; }
   private:
    llvm::Value* value_;
    llvm::Value* overflow_;
  };
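
As a rough picture of what ValueWithOverflow carries, here is a plain C++ analogue: a 128-bit value paired with an overflow flag, plus an addition that sets the flag with ordinary comparisons instead of overflow intrinsics (in the spirit of kUseOverflowIntrinsics = false above). This is a sketch under my own assumptions, not the IR that DecimalIR emits. Int128WithOverflow and AddWithOverflow are hypothetical names, and __int128 is a GCC/Clang extension.

```cpp
// 128-bit value plus a flag saying whether computing it overflowed.
struct Int128WithOverflow {
  __int128 value;
  bool overflow;
};

constexpr __int128 kInt128Max =
    static_cast<__int128>((static_cast<unsigned __int128>(1) << 127) - 1);
constexpr __int128 kInt128Min = -kInt128Max - 1;

// Add two signed 128-bit integers and report overflow.
inline Int128WithOverflow AddWithOverflow(__int128 x, __int128 y) {
  // Check against the limits before adding, so the check itself cannot overflow.
  const bool overflow =
      (y > 0 && x > kInt128Max - y) || (y < 0 && x < kInt128Min - y);
  // Do the addition in unsigned arithmetic, where wrap-around is well defined.
  const __int128 sum = static_cast<__int128>(
      static_cast<unsigned __int128>(x) + static_cast<unsigned __int128>(y));
  return {sum, overflow};
}
```

Returning the flag alongside the value lets the caller decide what overflow means for the result, for example marking the output row as null, instead of failing in the middle of a batch.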

Appendix: Mapping Between Arrow Types and LLVM Types

Gandiva type (Arrow data type)	C function type
int8	int8_t
int16	int16_t
int32	int32_t
int64	int64_t
uint8	uint8_t
uint16	uint16_t
uint32	uint32_t
uint64	uint64_t
float32	float
float64	double
boolean	bool
date32	int32_t
date64	int64_t
timestamp	int64_t
time32	int32_t
time64	int64_t
interval_month	int32_t
interval_day_time	int64_t
utf8 (as parameter type)	const char*, uint32_t
utf8 (as return type)	int64_t context, const char*, uint32_t*
binary (as parameter type)	const char*, uint32_t
binary (as return type)	int64_t context, const char*, uint32_t*
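
To show how a row of this table turns into a signature, here is a hypothetical C function that takes one utf8 argument and returns an int64. count_spaces_utf8 is a made-up name, not part of Gandiva, and registering it with the function registry is not shown. The thing to notice is that the string arrives as a pointer plus a length and is not NUL-terminated, so the length must bound every access.

```cpp
#include <cstdint>

// utf8 parameter -> (const char*, uint32_t); int64 return -> int64_t.
// Counts the ASCII space bytes in the input string.
extern "C" int64_t count_spaces_utf8(const char* data, uint32_t length) {
  int64_t count = 0;
  for (uint32_t i = 0; i < length; ++i) {
    if (data[i] == ' ') {
      ++count;
    }
  }
  return count;
}
```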

Summary

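Setting aside the puzzling choice to keep every file in one flat directory, the code quality is high. The comments at key points and the test cases make it possible to understand what Gandiva is doing, and it is a fun read.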

There is a manual for Gandiva external functions, but to see how things are actually used you still have to read the test cases.

Related resources

Chinese documentation

Resources from Dremio

Community blog posts
