[CIR][CIRGen] Support __builtin_isinf_sign #1142
As title. Also add function `buildCommonNeonBuiltinExpr`, just like OG's `emitCommonNeonBuiltinExpr`. This might help consolidate NEON cases and share common code.

Notice:
- I pretty much keep the skeleton of OG's `emitCommonNeonBuiltinExpr`, at the cost of not using a few variables it calculates. They might help in the future.
- The purpose of having CommonNeonBuiltinExpr is to reduce implementation code duplication. So far, we only have one type implemented, and it's hard for CIR to be more generic. But we should see whether, in the future, different types of intrinsics can share a more generic code path.

---------
Co-authored-by: Guojin He <[email protected]>
…no override (llvm#893) As title. The test case uses abort(), but it comes from real code.

Notice: since the CIR implementation of NoReturn calls is still pending, the generated LLVM code is:
`define dso_local void @test() #1 { call void @abort(), !dbg !8 ret void }`
which is not right. The correct code would be:
`define dso_local void @test() #1 { call void @abort(), !dbg !8 unreachable }`
Still sending this PR, as the NoReturn implementation is a separate issue.
as title. The test cases are from [clang codegen test case](https://github.com/llvm/clangir/blob/52323c17c6a3708b3eb72651465f7d4b82f057e7/clang/test/CodeGen/builtins.c#L37)
Before this patch, the CC lowering pass was applied only when explicitly requested by the user. This update changes the default behavior to always apply the CC lowering pass, with an option to disable it using the `-fno-clangir-call-conv-lowering` flag if necessary.

The primary objective is to make this pass a mandatory step in the compilation pipeline. This ensures that future contributions correctly implement the CC lowering for both existing and new targets, resulting in more consistent and accurate code generation.

From an implementation perspective, several `llvm_unreachable` statements have been substituted with a new `assert_or_abort` macro. This macro can be configured to either trigger a non-blocking assertion or a blocking unreachable statement. This facilitates test-by-test incremental development, as it does not require you to know which code path a test will trigger, and just causes a crash if it does.

A few notable changes:
- Support multi-block functions in CC lowering
- Ignore pointer-related CC lowering
- Ignore no-proto functions in CC lowering
- Handle missing type evaluation kinds
- Fix CC lowering for function declarations
- Unblock indirect function calls
- Disable the CC lowering pass on several tests
…ntrinsicString (llvm#899) As title. In addition, this PR has 2 extra changes:
1. Change the return type of GetNeonType to mlir::cir::VectorType so we don't have to cast all the time; this is consistent with [OG](https://github.com/llvm/clangir/blob/db6b7c07c076cb738d0acae248d7c3c199b2b952/clang/lib/CodeGen/CGBuiltin.cpp#L6234) as well.
2. Add the getAArch64SIMDIntrinsicString helper function so we have better debug info when hitting NYI in buildCommonNeonBuiltinExpr.

---------
Co-authored-by: Guojin He <[email protected]>
Then we can observe the time consumed in different parts of CIR. This patch is not complete, but I think that is fine given we can always add the rest easily.
> To keep information about whether an OpenCL kernel has uniform work
> group size or not, clang generates 'uniform-work-group-size' function
> attribute for every kernel:
>
> "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
> "uniform-work-group-size"="true" for OpenCL 2.0 and higher if '-cl-uniform-work-group-size' option was specified,
> "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no '-cl-uniform-work-group-size' options was specified.
>
> If the function is not an OpenCL kernel, 'uniform-work-group-size'
> attribute isn't generated.
>
> *From [Differential 43570](https://reviews.llvm.org/D43570)*

This PR introduces the `OpenCLKernelUniformWorkGroupSizeAttr` attribute to the ClangIR pipeline, as a step toward complete attribute support for OpenCL. Because this attribute is represented as a unit attribute in MLIR, its absence signifies either a non-kernel function or a `false` value for a kernel function. To match the original LLVM IR behavior, we also consider whether a function is an OpenCL kernel during lowering:

* If the function is not a kernel, the attribute is ignored. No LLVM function attribute is set.
* If the function is a kernel:
  * and the `OpenCLKernelUniformWorkGroupSizeAttr` is present, we generate the LLVM function attribute `"uniform-work-group-size"="true"`.
  * If absent, we generate `"uniform-work-group-size"="false"`.
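The two decision rules described above (the frontend rule from the quoted Differential and the lowering rule from this PR) can be sketched as plain C functions. This is only an illustration: the function and enum names are hypothetical, and encoding OpenCL versions as 120 and 200 is an assumption, not something the PR states.

```c
#include <stdbool.h>

/* Possible outcomes for the "uniform-work-group-size" LLVM function attribute. */
enum uwgs_attr { UWGS_NOT_EMITTED, UWGS_TRUE, UWGS_FALSE };

/* Frontend rule from the quoted Differential: a kernel counts as uniform for
   OpenCL 1.2 and lower, or when -cl-uniform-work-group-size was given.
   ASSUMPTION: versions are encoded as 120 (1.2) and 200 (2.0). */
static bool kernel_has_uniform_wg_size(int opencl_version, bool flag_given) {
    return opencl_version <= 120 || flag_given;
}

/* Lowering rule from the PR description: non-kernels get no attribute;
   kernels get "true" when the unit attribute is present, "false" otherwise. */
static enum uwgs_attr lower_uwgs(bool is_kernel, bool unit_attr_present) {
    if (!is_kernel)
        return UWGS_NOT_EMITTED;
    return unit_attr_present ? UWGS_TRUE : UWGS_FALSE;
}
```

Keeping the kernel check in the lowering step is what lets a single unit attribute stand in for a three-way outcome.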
…#897) `CIRGenModule::buildGlobal` --[rename]--> `CIRGenModule::getOrCreateCIRGlobal`

We already have `CIRGenModule::buildGlobal`, which corresponds to `CodeGenModule::EmitGlobal`. But there is an overload of `buildGlobal` used by `getAddrOfGlobalVar`. Since this name is confusing, this PR renames it to `getOrCreateCIRGlobal`.

Note that `getOrCreateCIRGlobal` already exists. Making the renamed function an overload of it is intentional: the renamed function is basically a wrapper around the original `getOrCreateCIRGlobal` with more specific parameters:

`getOrCreateCIRGlobal(decl, type, isDef)` --[call]--> `getOrCreateCIRGlobal(getMangledName(decl), type, decl->getType()->getAS(), decl, isDef)`
…m#901) just as title. --------- Co-authored-by: Guojin He <[email protected]>
…aller pieces (llvm#902) The missing feature flag for OpenCL has very few occurrences now. This PR rearranges them into proper pieces to better track them.
Fix llvm#801 (the remaining `constant` part). Actually, the missing stage is CIRGen. There are two places where `GV.setConstant` is called:

* `buildGlobalVarDefinition`
* `getOrCreateCIRGlobal`

Therefore, the primary test `global-constant.c` contains a global definition and a global declaration with a use, which should be enough to cover the two paths. A test for an OpenCL `constant`-qualified global is also added. Some existing test cases need tweaking to avoid failures caused by the missing constant.
as title. --------- Co-authored-by: Guojin He <[email protected]>
Consider the following code snippet `tmp.c`:
```
#define N 3200
struct S {
  double a[N];
  double b[N];
} s;
double *b = s.b;
void foo() {
  double x = 0;
  for (int i = 0; i < N; i++)
    x += b[i];
}
int main() {
  foo();
  return 0;
}
```
Running `bin/clang tmp.c -fclangir -o tmp && ./tmp` causes a segmentation fault. I compared the LLVM IR with and without CIR and noticed the difference that causes this:

`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 0, i32 1)` // no CIR
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 1)` // with CIR

It seems there is a missing index when creating global pointers from structs. I have updated `Lowering/DirectToLLVM/LowerToLLVM.cpp` and added a few tests.
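To see why the missing index matters, the byte offsets that the two `getelementptr` forms above resolve to can be computed in C. This is a minimal sketch; the two helper names are hypothetical and only mirror the struct from `tmp.c`.

```c
#include <stddef.h>

#define N 3200

struct S {
    double a[N];
    double b[N];
};

/* `getelementptr %struct.S, ptr @s, i32 0, i32 1`:
   step over zero whole structs, then select field 1 (b). */
static size_t offset_with_field_index(void) {
    return offsetof(struct S, b);
}

/* `getelementptr %struct.S, ptr @s, i32 1`:
   step over one whole struct, landing just past the end of s. */
static size_t offset_without_field_index(void) {
    return 1 * sizeof(struct S);
}
```

With the leading zero index dropped, `b` points one full `struct S` past `s` instead of at `s.b`, so the loop in `foo` reads memory outside the object.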
as title. Notice this is not target specific nor neon intrinsics.
Entails several minor changes:
- Duplicate resume blocks around.
- Disable LP caching; we repeat them as often as necessary.
- Update maps accordingly for tracking places to patch up.
- Make changes to clean up block handling.
- Fix an issue in flatten cfg.
As title. The current implementation in this PR uses cir::CastOp integral casting to implement vector type truncation, so the LLVM lowering code has been changed to accommodate it. In addition, code was added to [CIRGenBuiltinAArch64.cpp](https://github.com/llvm/clangir/pull/909/files#diff-6f7700013aa60ed524eb6ddcbab90c4dd288c384f9434547b038357868334932) to make it more similar to OG:
```
mlir::Type ty = vTy;
if (!ty)
```
A test case was added to neon.c, as the file already contains similar vector move test cases such as vmovl.

---------
Co-authored-by: Guojin He <[email protected]>
…m#935) as title. Also changed [neon-ldst.c](https://github.com/llvm/clangir/compare/main...ghehg:clangir-llvm-ghehg:macM3?expand=1#diff-ea4814b6503bff2b7bc4afc6400565e6e89e5785bfcda587dc8401d8de5d3a22) to make it have the same RUN options as OG [clang/test/CodeGen/aarch64-neon-intrinsics.c](https://github.com/llvm/clangir/blob/main/clang/test/CodeGen/aarch64-neon-intrinsics.c) Those options help us to avoid checking load/store pairs thus make the test less verbose and easier to compare against OG. Co-authored-by: Guojin He <[email protected]>
Implement derived-to-base address conversions for non-virtual base classes. The code gen for this situation was only implemented when the offset was zero, and it simply created a `cir.base_class_addr` op for which no lowering or other transformation existed. Conversion to a virtual base class is not yet implemented.

Two new fields are added to the `cir.base_class_addr` operation: the byte offset of the necessary adjustment, and a boolean flag indicating whether the source operand may be null. The offset is easy to compute in the front end while the entire path of intermediate classes is still available. It would be difficult for the back end to recompute the offset, so it is best to store it in the operation. The null-pointer check is best done late in the lowering process. But whether or not the null-pointer check is needed is only known by the front end; the back end can't figure that out. So that flag needs to be stored in the operation.

`CIRGenFunction::getAddressOfBaseClass` was largely rewritten. The code path no longer matches the equivalent function in the LLVM IR code gen, because the generated ClangIR is quite different from the generated LLVM IR.

`cir.base_class_addr` is lowered to LLVM IR as a `getelementptr` operation. If a null-pointer check is needed, then that is wrapped in a `select` operation.

When generating code for a constructor or destructor, an incorrect `cir.ptr_stride` op was used to convert the pointer to a base class. The code was assuming that the operand of `cir.ptr_stride` was measured in bytes; the operand is the number of elements, not the number of bytes. So the base class constructor was being called on the wrong chunk of memory. Fix this by using a `cir.base_class_addr` op instead of `cir.ptr_stride` in this scenario.

The use of `cir.ptr_stride` in `ApplyNonVirtualAndVirtualOffset` had the same problem. Continue using `cir.ptr_stride` here, but temporarily convert the pointer to type `char*` so the pointer is adjusted correctly.

Adjust the expected results of three existing tests in response to these changes. Add two new tests, one code gen and one lowering, to cover the case where a base class is at a non-zero offset.
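The difference between a byte offset and an element stride, and the null-guarded adjustment described above, can be sketched in C. This is only an analogy under stated assumptions: the structs stand in for a class with two non-virtual bases, and the helper names are hypothetical rather than anything in ClangIR.

```c
#include <stddef.h>

struct Base1 { int x; };
struct Base2 { long y; };
/* Stand-in for a class with two non-virtual bases laid out in order. */
struct Derived { struct Base1 b1; struct Base2 b2; };

/* Byte-wise adjustment, analogous to a getelementptr on a byte pointer.
   When may_be_null is set, a null input stays null (the `select` case). */
static void *base_class_addr_sketch(void *derived, size_t byte_offset,
                                    int may_be_null) {
    if (may_be_null && derived == NULL)
        return NULL;
    return (char *)derived + byte_offset;
}

/* Displacement in bytes produced by an element-wise stride: the count is
   multiplied by the element size, which is the bug when the count was
   really meant as a byte offset. */
static size_t stride_displacement_bytes(size_t count, size_t elem_size) {
    return count * elem_size;
}
```

Converting to `char *` first makes the element size 1, which is why the temporary cast in `ApplyNonVirtualAndVirtualOffset` yields the intended byte adjustment.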
Fix llvm#934 While here move scope op codegen outside the builder, so it's easier to dump blocks and operations while debugging.
…m#1169) For example, the following reaches ["NYI"](https://github.com/llvm/clangir/blob/c8b626d49e7f306052b2e6d3ce60b1f689d37cb5/clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp#L348) when lowering to AArch64:
```
typedef struct {
  union {
    struct {
      char a, b;
    };
    char c;
  };
} A;

void foo(A a) {}

void bar() {
  A a;
  foo(a);
}
```
Currently, the value of the struct becomes a bitcast operation, so we can simply extend `findAlloca` to be able to trace the source alloca properly, then use that for the [coercion](https://github.com/llvm/clangir/blob/c8b626d49e7f306052b2e6d3ce60b1f689d37cb5/clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp#L341) through memory. I have also added a test for this case.
Added a few FIXMEs. There are 2 types of FIXMEs:
1. Most of them are for missing function call and parameter attributes. I didn't add them at every missing site of this type, as that would have been just copy-paste.
2. FIXME in lambda __invoke(): OG simply returns, but CIR generates a call to llvm.trap. This is temporary and we will fix it in the near future. But I feel I should still list those IRs so that once we fix the problem with codegen of invoke, we'd get a test failure on this one and fix it. This way, this test file becomes a natural test case for the implementation of invoke.
There are scenarios where we are not emitting cleanups, this commit starts to pave the way to be more complete in that area. Small addition of skeleton here plus some fixes. Both `clang/test/CIR/CodeGen/vla.c` and `clang/test/CIR/CodeGen/nrvo.cpp` now pass in face of this code path.
Force-pushed from d0f75b4 to 7c8ffe9.
…lvm#1166) Close llvm#1131. This is another solution to llvm#1160.

This patch reverts llvm#1007 but keeps its test. The problem described in llvm#1007 is worked around by skipping the check that element types in arrays are equivalent. We can't mock such checks simply by adding another attribute to `ConstStructAttr`, since the types are aggregated; e.g., we would have to handle cases like `struct { union { ... } }` and `struct { struct { union { ... } } }` and so on. To do that, we would have to introduce what I called "two type systems" in llvm#1160.

This is not ideal, given that it removes a reasonable check. But it might not be so problematic, since the Sema part has already checked it. (Of course, we still face the risk of introducing new bugs anyway.)
I'm assuming
// see https://github.com/llvm/clangir/issues/480
// fix the issue can eliminate lots of redundant cast instruction
// for IsInf, i1 -> i8 -> i1
// for IsNeg, i1 -> i8 -> i32 -> i1
@orbiri fyi (you might want to track this too)
…of floating type (llvm#1174) [PR1132](llvm#1132) implements the missing feature `fpUnaryOPsSupportVectorType`, so revisit this code. Another change is that I stopped using `cir::isAnyFloatingPointType`, as it includes types like long double and FP80, which are not supported by the [builtin's signature](https://clang.llvm.org/docs/LanguageExtensions.html#vector-builtins).
[OG's implementation](https://github.com/llvm/clangir/blob/aaf38b30d31251f3411790820c5e1bf914393ddc/clang/lib/CodeGen/CGBuiltin.cpp#L7527) provides one common code path to handle all NEON SISD intrinsics. But IMHO, it entangles different things together, which hurts readability. Here, we start with a simple, easy-to-understand approach for a specific case. In the future, as we handle more intrinsics, we may come up with a few simple common patterns.
Can you explain why signbit can't return a Boolean? What are its other usages? (I don't have full context, so that would help me better understand this patch 😇) Looking around, it seems that even C++ exposes this API with a Boolean: https://en.cppreference.com/w/cpp/numeric/math/signbit Perhaps it is time to convert this op to Boolean as well? :)
Please see the error in #1187.
If you are asking whether their semantics are equivalent, the answer is yes.
This error indicates that there's an error either in your codegen or in the lowering code. Either way, I would not recommend increasing the tech debt with this change. I would recommend using the -debug flag of cir-opt and inspecting the lowering step by step!
Co-authored-by: Sirui Mu <[email protected]>
Thanks @orbiri, way to go. Reviewed and landed #1187. @PikachuHyA, let me know when this PR is ready again.
This PR adds `clang::CodeGenOptions` to the lowering context. Similar to `clang::LangOptions`, the code generation options are currently set to the default values when initializing the lowering context. Besides, this PR also adds a new attribute `#cir.opt_level`. The attribute is a module-level attribute and it holds the optimization level (e.g. -O1, -Oz, etc.). The attribute is consumed when initializing the lowering context to populate the `OptimizationLevel` and the `OptimizeSize` field in the code generation options. CIRGen is updated to attach this attribute to the module op.
Removes some NYIs, but leaves assert(false) in place due to missing tests. It looks better since it is not as scary as NYI.
This PR adds support for base-to-derived and derived-to-base casts on pointer-to-data-member values. Related to llvm#973.
Force-pushed from f88963e to 650c796.
@orbiri After updating, I left this comment in the code:

// FIXME: CIR currently converts cir::BoolType to i8 type unconditionally.
// See https://github.com/llvm/clangir/issues/480
// Fixing this issue will eliminate redundant cast instructions
// for IsInf and IsNeg: i1 -> i8 -> i1

The LLVM IR generated by running:

./bin/clang ../clang/test/CIR/CodeGen/builtin-isinf-sign.c -Xclang -emit-llvm -o t.ll -c -fclangir

is as follows:

define dso_local i32 @test_float_isinf_sign(float %0) #0 {
%2 = alloca float, i64 1, align 4
%3 = alloca i32, i64 1, align 4
store float %0, ptr %2, align 4
%4 = load float, ptr %2, align 4
%5 = call float @llvm.fabs.f32(float %4)
%6 = call i1 @llvm.is.fpclass.f32(float %5, i32 516)
%7 = zext i1 %6 to i8
%8 = bitcast float %4 to i32
%9 = icmp slt i32 %8, 0
%10 = zext i1 %9 to i8
%11 = trunc i8 %10 to i1
%12 = select i1 %11, i32 -1, i32 1
%13 = trunc i8 %7 to i1
%14 = select i1 %13, i32 %12, i32 0
store i32 %14, ptr %3, align 4
%15 = load i32, ptr %3, align 4
ret i32 %15
}

As shown, there are unnecessary conversions (the `zext`/`trunc` pairs between `i1` and `i8`).

Additionally, the LLVM IR generated by running:

./bin/clang ../clang/test/CIR/CodeGen/builtin-isinf-sign.c -Xclang -emit-llvm -o t.orig.ll -c

is:

define dso_local i32 @test_float_isinf_sign(float noundef %x) #0 {
entry:
%x.addr = alloca float, align 4
store float %x, ptr %x.addr, align 4
%0 = load float, ptr %x.addr, align 4
%1 = call float @llvm.fabs.f32(float %0) #2
%isinf = fcmp oeq float %1, 0x7FF0000000000000
%2 = bitcast float %0 to i32
%3 = icmp slt i32 %2, 0
%4 = select i1 %3, i32 -1, i32 1
%5 = select i1 %isinf, i32 %4, i32 0
ret i32 %5
}
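Both IR dumps above compute the same value: take `fabs` of the input, test it for infinity (`llvm.is.fpclass` with mask 516, i.e. negative or positive infinity, in the CIR path; `fcmp oeq` against infinity in the original path), test the sign bit, and chain two selects. A minimal C sketch of that logic, with a hypothetical function name and using `<math.h>` macros in place of the intrinsics, might look like this:

```c
#include <math.h>

/* Sketch of the select chain in the IR above:
   isinf(fabs(x)) ? (signbit(x) ? -1 : 1) : 0
   The name is illustrative only; the real builtin is __builtin_isinf_sign. */
static int isinf_sign_sketch(float x) {
    int is_inf = isinf(fabsf(x)) ? 1 : 0; /* IsInf: classify |x| */
    int is_neg = signbit(x) ? 1 : 0;      /* IsNeg: sign bit of x itself */
    int sign = is_neg ? -1 : 1;           /* inner select */
    return is_inf ? sign : 0;             /* outer select */
}
```

Note that `is_inf` and `is_neg` are both one-bit conditions consumed directly by the selects, which is why the `i1 -> i8 -> i1` round trips in the CIR-generated IR are redundant.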
@orbiri ping
It will be sorted out very soon! Don't optimize on the LLVM output but rather on the CIR output :)