Support LLVM backend #171

Merged
merged 84 commits into from
Jul 5, 2024

leewei05
Contributor

@leewei05 leewei05 commented Jul 1, 2024

  • Passed all code generation tests. 🥳

Changes to the old infrastructure

  • Testing: I use lli to test the generated LLVM IR. I assumed that lli could link printf against libc (its own C library), and it actually worked! To make this work, I did a little bit of hacking in LLVMIRGenerator::FuncCallExprNode: whenever __builtin_print is called, it is replaced with printf, so we don't need to modify our test files.
  • Makefile: I added a few flags with llvm-config.
  • main.cpp: separated the QBE and LLVM code generation processes into two helper functions. Added LLVM as a target option; maybe we can consider making LLVM the default target. 🐉
  • Type: created a fields accessor for record types. Added MemberIndex to record types for getting the index of a member; a union always returns index 0 if the member is found. Lastly, changed return_type() of FuncType to support LLVMIRUtil::GetLLVMType(), which may also need additional help.
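The MemberIndex behaviour from the Type item can be pictured with a small standalone sketch. This is not the code in this PR: RecordType, its fields, and the std::optional return type are assumptions made for illustration, and only the "a union always yields index 0" rule is taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Illustrative stand-in for the project's record types (assumption).
struct RecordType {
  bool is_union;
  std::vector<std::string> member_names;  // in declaration order
};

// Returns the index used to address member `name`, or std::nullopt if the
// record has no such member.
std::optional<std::size_t> MemberIndex(const RecordType& record,
                                       const std::string& name) {
  assert(!name.empty() && "member name must not be empty");
  for (std::size_t i = 0; i < record.member_names.size(); ++i) {
    if (record.member_names[i] == name) {
      // Every union member starts at the same address, so a found member
      // always maps to index 0; a struct member maps to its position.
      return record.is_union ? 0 : i;
    }
  }
  return std::nullopt;
}
```

With a struct declaring x then y, looking up y gives 1; the same lookup on a union with the same members gives 0, and an unknown name gives an empty optional.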

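The __builtin_print substitution from the Testing item amounts to a name-mapping step performed before the call is emitted. A minimal sketch follows, assuming a free function named ResolveCalleeName, which is hypothetical; the PR does this inside the generator's function call visitor.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: maps the callee name seen in the AST to the symbol
// that ends up in the generated IR. Only __builtin_print is rewritten.
std::string ResolveCalleeName(std::string_view name) {
  assert(!name.empty() && "callee name must not be empty");
  if (name == "__builtin_print") {
    return "printf";
  }
  return std::string{name};
}
```

Every other callee name passes through unchanged, so the existing test files keep calling __builtin_print while the IR references printf.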
Newly introduced

  • include/llvm/util.hpp, src/llvm/util.cpp: Some utility functions for creating LLVM IR. Suggestions on how to store builder_ are appreciated! 🙏
    • The GetLLVMType function translates our Type into the corresponding llvm::Type*, which can be used directly in CreateCall/Load/Store, etc.
  • include/llvm_ir_generator.hpp: Initializes some LLVM IR builder objects. I saw people say that the details of LLVMContext mostly matter to LLVM developers; right now, we only need to know that LLVMContext holds the LLVM core infrastructure. Module is kind of like the translation unit in our parser: it knows the functions, the global variables, and the symbol table, which I didn't use (maybe it has some handy functions we can use). IRBuilder is the key object for constructing LLVM IR through the LLVM API.
  • src/llvm_ir_generator.cpp: The main LLVM IR code generator file. Some differences worth noticing:
    • Comparison operators, such as greater than and equal to, are not binary operators in LLVM. They have a separate class, llvm::CmpInst, so the implementation is a bit different from QBE's. I think they are separated because comparisons may have different low-level architecture optimization techniques than other binary operations.
    • I stole some ideas from the QBE generator, such as num_recorder, which becomes val_recorder in the LLVM IR generator. Since llvm::Value is a super important class, and almost everything in the IR (instructions, constants, functions, basic blocks) is an llvm::Value, with llvm::Type as the notable exception, I decided to store llvm::Value* in val_recorder. Upper-level nodes can directly use the value stored in it.
    • We use GetElementPtr to compute memory addresses within arrays, structs, and unions.
    • Each builder_->Create... call returns an llvm::Value* for the instruction it creates, which plays the role of the dest number we can use afterwards.
    • Almost every builder_->CreateAlloca/CreateCall/CreateStore/CreateLoad requires the type involved (for CreateCall, the callee's function type). This can be tricky for function pointers because we cannot get the element type of a pointer; LLVM dropped that in order to support opaque pointers. The solution for a function pointer's type is to have GetLLVMType return llvm::FunctionType instead of llvm::PointerType.
    • Every basic block must end with exactly one terminator instruction, such as br or ret. An llvm::BasicBlock* can also act as a label: if we want to jump to label xx, we branch to basic block xx.
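The val_recorder idea can be sketched without pulling in LLVM. In the sketch below, Value is an opaque stand-in for llvm::Value, and the stack-based storage as well as the ValOfPrevExpr and Empty names are assumptions; only Record appears in the code under review.

```cpp
#include <cassert>
#include <stack>

// Opaque stand-in for llvm::Value; the sketch never looks inside it.
struct Value {};

// Hypothetical recorder: each expression visitor records the value it
// produced, and the parent visitor takes it back out.
class ValRecorder {
 public:
  void Record(Value* val) { vals_.push(val); }

  // Returns and removes the most recently recorded value.
  Value* ValOfPrevExpr() {
    assert(!vals_.empty() && "no expression has recorded a value yet");
    Value* val = vals_.top();
    vals_.pop();
    return val;
  }

  bool Empty() const { return vals_.empty(); }

 private:
  std::stack<Value*> vals_;
};
```

A binary expression visitor would visit its left operand, then its right operand, and then take the two values back in reverse order (right first, then left) before emitting its own instruction and recording the result.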

Help wanted!

No help needed! I refactored GetLLVMType and removed redundant code! 😄

As I mentioned, calling a function requires an llvm::FunctionType* (return type + parameter types), so function pointers can be tricky to handle. The ideal way is to get the LLVM type from our current type system, which is the job of LLVMIRUtil::GetLLVMType: it returns the correct LLVM type for a given Type, similar to your ResolvedType. Right now, I have a val_to_type map for mapping llvm::Value* to llvm::Type*. This way, I can get the correct function type of the corresponding llvm::Value* to make a function call. My plan is to get rid of val_to_type and fully depend on our type system to get the correct LLVM type.
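The two views of a function pointer (a function type when making a call, a plain pointer when storing it in a variable) can be modelled with toy types. Everything in this sketch is an assumption made for illustration: SrcType, IrKind, LowerForCall, and LowerForStorage are not the project's or LLVM's classes; they only mirror the rule described above.

```cpp
#include <cassert>

// Toy source-level types (assumption, not the project's Type hierarchy).
enum class SrcKind { kInt, kFunc, kPtr };
struct SrcType {
  SrcKind kind;
  const SrcType* base = nullptr;  // pointee, only meaningful for kPtr
};

// Toy IR-level type categories (assumption, not llvm::Type).
enum class IrKind { kInt, kFunction, kOpaquePtr };

// Lowering used when a call needs the callee's signature: a pointer to a
// function lowers to the function type itself, because an opaque pointer
// does not carry its pointee type.
IrKind LowerForCall(const SrcType& type) {
  switch (type.kind) {
    case SrcKind::kInt:
      return IrKind::kInt;
    case SrcKind::kFunc:
      return IrKind::kFunction;
    case SrcKind::kPtr:
      assert(type.base != nullptr && "pointer type needs a base type");
      return type.base->kind == SrcKind::kFunc ? IrKind::kFunction
                                               : IrKind::kOpaquePtr;
  }
  return IrKind::kOpaquePtr;  // not reached for valid SrcKind values
}

// Lowering used for variables (allocation, load, store): a function type
// is turned back into a pointer, since a variable cannot hold a function.
IrKind LowerForStorage(const SrcType& type) {
  const IrKind lowered = LowerForCall(type);
  return lowered == IrKind::kFunction ? IrKind::kOpaquePtr : lowered;
}
```

With a type system that can answer both questions, a side table from values to types is no longer needed to recover the callee's signature.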

The current bottleneck is changing PtrType's base_type() from const Type& base_type() to const std::unique_ptr<Type>& base_type(), so that I can pass the base type of PtrType to LLVMIRUtil::GetLLVMType. Changing this also requires changing IsEqual, IsConvertible, etc.

Collaborator

@Lai-YT Lai-YT left a comment

Overall, this is a great PR that makes significant progress, and I enjoyed reviewing it. As I am not familiar with the LLVM APIs, there's not much I can support. However, as all test cases have passed, I believe there are no major issues. Small ones can be found and addressed as we evolve in the future. 🔥

I have submitted some minor revisions that I hope will make the PR even better. 😄

include/llvm_ir_generator.hpp
Comment on lines 75 to 82
/// @brief A LLVM object that includes core LLVM infrastructure.
std::unique_ptr<llvm::LLVMContext> context_;
/// @brief Provides LLVM Builder API for constructing IR. By default, Constant
/// folding is enabled and we have more flexibility for inserting
/// instructions.
std::unique_ptr<llvm::IRBuilder<>> builder_;
/// @brief Stores global variables, function lists, and the constructed IR.
std::unique_ptr<llvm::Module> module_;
Collaborator

They seem to be passed by reference when used, meaning they are not going to be copied. Furthermore, I believe these classes are not copyable, thus already unique within the code generator.

src/llvm_ir_generator.cpp
Comment on lines 157 to 159
auto var_type = llvm_util_.GetLLVMType(*(decl.type));
// For function pointer, we need to change from FunctionType to PointerType
var_type = var_type->isFunctionTy() ? var_type->getPointerTo() : var_type;
Collaborator

Got it. I didn't realize the type is already modified by the conversion.

}
} else if (val->getType()->isPointerTy()) {
// function pointer
auto type = llvm_util_.GetLLVMType(*(call_expr.func_expr->type));
Collaborator

When will LLVM require a function pointer to have a pointer type? I'm considering having GetLLVMType return the pointer type for function pointers directly. Would that be more straightforward?

Comment on lines 527 to 536
if (id_expr.type->IsPtr() || id_expr.type->IsFunc()) {
  auto res = builder_->CreateLoad(llvm_util_.IntPtrType(), id_val);
  val_recorder.Record(res);
  val_to_id_addr[res] = id_val;
} else {
  auto res =
      builder_->CreateLoad(llvm_util_.GetLLVMType(*(id_expr.type)), id_val);
  val_recorder.Record(res);
  val_to_id_addr[res] = id_val;
}
Collaborator

Oh I see. This is because GetLLVMType returns FunctionType on function pointers. 😅

Comment on lines 559 to 560
auto res_addr = builder_->CreateConstInBoundsGEP2_32(
    arr_type, base_addr, 0, (unsigned int)index->val);
Collaborator

Wow, there's a dedicated page for GEP. 😮 It's a complex topic!

@leewei05 leewei05 requested a review from Lai-YT July 5, 2024 15:17
Collaborator

@Lai-YT Lai-YT left a comment

LGTM 🎉

@Lai-YT Lai-YT merged commit d787d64 into fruits-lab:main Jul 5, 2024
4 checks passed
@leewei05 leewei05 deleted the llvm-backend branch July 5, 2024 15:28
2 participants