Reformatting all grammars #3843
Looks like there are 2-3 grammars that got errors. I'll investigate over the weekend. |
antlr/antlr4 failed because the .errors needs to be remastered. The example .g4's were reformatted. apt failed because there's a symbol conflict with "options" in the first rule -- it's a keyword. The Antlr4 tool does not seem to have a problem with "options=" but does with "options =". Bug in the Antlr4 Tool. For sql/tsql, the error is again a symbol conflict. "options+=" vs "options +=". Bug in Antlr4 Tool. |
Ah, thanks Ken, for the analysis. Shall I revert the formatting of the example .g4's or rather adjust what you call "the .errors" (no idea what that is)? For the other 2: can we just ignore them? I wonder why these symbol conflicts do not show up in other PRs, but maybe that is because these specific grammars are seldom touched...
I guess just revert the formatting for the examples. The .errors file is the output from the parse. The example file itself has an error in it, and we're just checking that the parser works the same across all targets and hasn't changed from one version of the tool to the next. The symbol conflict wasn't detected because "options=" gets through the Antlr4 tool, but "options =" does not. I think just rename the element label "options" to "options_" since the grammar should work regardless of formatting. I added an Issue over in the antlr4 repo to have this corrected at some point. antlr/antlr4#4474
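The rename suggested above can be sketched in a few lines. This is only an illustration, assuming a plain regex pass over the grammar text is good enough: `rename_options_label` is a hypothetical helper, not part of the Antlr4 tool or of this repository, and it does not touch references to the label inside embedded actions.

```python
import re

# Hypothetical helper (an assumption for illustration, not repository code).
# Matches an element label named exactly "options" that is followed by "="
# or "+=", with or without whitespace before the operator.
LABEL = re.compile(r"\boptions(?=\s*\+?=)")


def rename_options_label(grammar_text: str) -> str:
    """Rename the element label `options` to `options_`."""
    return LABEL.sub("options_", grammar_text)


# Both spellings end up with the same label, so the grammar no longer
# depends on whether the formatter puts a space before the operator.
compact = rename_options_label("stmt : options=optionList ;")
spaced = rename_options_label("stmt : options += option ;")
```

An `options { ... }` block is left alone because the keyword there is not followed by an assignment operator, and an already renamed `options_` label is not matched a second time.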
4d71758
to
35e5f38
Compare
I reverted the examples and fixed the symbol conflicts. However, I don't fully understand why |
It's not target specific. It's actually from the Antlr4 tool. It's kind of hard to see the error, but the build error is here: https://github.com/antlr/grammars-v4/actions/runs/6988622530/job/19016369650#step:22:1336. The Antlr4 tool just doesn't like the grammar with the space after the "options", whereas before reformatting there was no space after "options" and the tool seems to like the grammar. |
Simply jaw-dropping how long the validation takes in this repo... There are still a few failures, where I don't know what to do. I believe this patch is fine.
Changing every grammar more or less means every grammar is tested for every target. And even though the work is split up across dozens of machines, a lot of the grammars--frankly--suck. That's because they contain terrible and numerous ambiguities, and many fall back to full context. In some ways, this is actually a good thing because it tested a lot of the ATN interp engine code for the parser and lexer. Believe it or not, the testing is better nowadays than a year or two ago because we now only test those grammars that change in a PR, not the whole 350+ grammars. The macOS servers on GitHub Actions are still slow, but better. So, here's a rundown on the failures.
I think it's fine to go. I'll clean up these grammars at some point. |
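The idea of testing only the grammars that change in a PR can be sketched as follows. This is a minimal illustration, not the repository's actual test scripts: `changed_grammar_dirs` is a hypothetical helper, and the list of grammar directories is assumed to be known in advance.

```python
from pathlib import PurePosixPath


def changed_grammar_dirs(changed_files, grammar_dirs):
    """Return the grammar directories that contain at least one changed file.

    Both arguments are iterables of repository-relative POSIX paths.
    """
    hits = set()
    for name in changed_files:
        parents = PurePosixPath(name).parents
        for directory in grammar_dirs:
            # A grammar is affected when it is an ancestor of a changed file.
            if PurePosixPath(directory) in parents:
                hits.add(directory)
    return sorted(hits)


to_test = changed_grammar_dirs(
    ["sql/tsql/TSqlParser.g4", "README.md", "apt/apt.g4"],
    ["apt", "scala", "sql/tsql"],
)
```

A PR that reformats every grammar makes every directory show up in the result, which is why this patch triggers the full test run for all targets.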
Hmm, probably some of the grammars should be removed if they don't work. It's great to have such an exhaustive test framework, but taking hours to run is really a pain. Anyway, we cannot fix all problems in a single PR. @teverett any objections to merge this PR? |
@mike-lischke no objection to merging the PR. I prefer not to remove any grammars however. Thanks for this contribution. I know the new QA tools @kaby76 has worked on are great, and have the ability to exclude grammar tests for certain combinations of grammar and language. Can we use those to exclude the grammars that are having trouble so that we have a clean build? |
Right, we should not do anything in addition to this already large patch. @kaby76 can you please review the patch? Thanks! |
It's fine. I created an issue to keep track of correcting these grammars in a separate PR. It's best to get the bulk of the changes in before git conflicts start to become an issue. The code looks great. Nice and consistent. |
Is a formal approval required here to merge a patch (i.e. via the review process)? If not @teverett please proceed. |
Can we correct the build errors before merging this?
That would require adding changes unrelated to the patch. As I understand it, these errors already existed before, but some did not show until now ("options=") or have been ignored (?) (slow PHP and Go grammars, stack overflow). I assume fixing those issues will take quite a while and I cannot do that. Additionally, with every new patch you merge I have to touch this PR again to fix merge conflicts. I would prefer to merge this PR, because it is not really the reason why some tests fail. But if someone who knows the nature of these problems better than me (@kaby76?) volunteers to fix them, I'm ready to wait.
Yes, these are problems that existed before this PR, #3847. I can make a PR ASAP to fix these problems. Then, you can merge the changes into your PR.
The PR containing the fixes is #3849 . It contains changes for these grammars: apt/ asm/ptx/ptx-isa-1.0/ elixir/ icon/ lark/ scala/ smtlibv2/ sql/tsql/ wavefront/ @mike-lischke Should I reformat the grammars with your tool in my PR, or leave as is before your PR? |
This patch is simpler than I expected. Let me just cherry pick it into my PR here. |
Wow, the build is taking a while. Maybe I should fork a job for each |
That's what I meant above :-D Maybe a job matrix helps? |
It is already in a job matrix. Currently, the matrix is Damn, now why the fail? Looking... Ah, I see, you need to remove the "Go;" from scala/desc.xml so that testing is not done for that target. I now see Github Actions has a limit of the number of concurrent jobs. https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration#usage-limits . There goes that idea. |
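Removing "Go;" from a target list, as described for scala/desc.xml, amounts to dropping one entry from a semicolon-separated string. The sketch below assumes that layout based only on the "Go;" wording above; `drop_target` and the sample list are hypothetical and do not reproduce the real file contents.

```python
def drop_target(targets: str, target: str) -> str:
    """Remove one target from a semicolon-separated target list."""
    # Skip empty pieces so a trailing ";" does not produce an empty entry.
    kept = [t for t in targets.split(";") if t and t != target]
    return ";".join(kept)


# Hypothetical target list, used for illustration only.
remaining = drop_target("CSharp;Go;Java", "Go")
```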
Sorry, I just realized I didn't set |
2ca56c7
to
13d8850
Compare
We are close now. Only PHP on Ubuntu failed. @kaby76 can you please take another look at what's wrong this time?
The difference between the last clean build of this PR (here) and the last build (here) is because we now have my PR build change testing PHP on Ubuntu with Powershell. There's a bug I now see in the Powershell script template, so I cannot tell why PHP failed for grammars I suggest we merge this PR in ASAP so that additional merges/rebase DO NOT cause more issues. |
Oh geezus H.... Sorry, my bug. The environment is not setting up PHP for Powershell. So, I have no idea what PHP it is running. Works fine in Bash, where it does this step. There are no steps to install and test the version of PHP in the Powershell run https://github.com/antlr/grammars-v4/actions/runs/7010383029/job/19070929695. Yeah, sorry. Let's get this in ASAP @teverett . I'll fix the workflow in a separate PR. |
So, shall we wait for your fix for PHP or still merge? |
Let's get your PR in ASAP because it's huge, and may get whacked by another merge. I'll clean up the workflow issue afterwards. The grammars work fine for PHP; the Bash and Powershell tests are redundant. (I took the Bash scripts, asked ChatGPT to translate them to yucky Powershell, fixed up the Powershell scripts because Chat doesn't know what it's doing.) |
So, I fixed this problem. @mike-lischke If you want to rebase, that's up to you. Or, @teverett could merge my PR in first, then merge PR #3843 (there shouldn't be any merge conflicts) and it should all build and test fine. Or, just merge #3843 first, ignore the error with PHP/Ubuntu, and then PR #3857, |
I'd prefer to merge my patch first. |
Well, there is a merge conflict now with sql/mysql/Positive-Technologies/MySqlParser.g4 |
@harveyyue committed something, even though we said we don't want any commit until this patch is in. How can that be? How can he push something without a review process? @teverett What if someone else decides to commit while we are waiting for the next validation and then the next and the next? Where should that end? I assumed you are the owner of this repo and have the last word what goes in 🤔 |
With this patch we introduce a common set of rules for the grammars in this repository. All future changes in a grammar must follow these rules.
They serve special purposes and are not normal grammars.
These have been there for a while already, but were never detected due to a bug in ANTLR.
13d8850
to
2f21e2b
Compare
@KvanTTT merged PR #3858, and doesn't seem to be in the loop on this PR. Hopefully, no more merges into this repo until this PR is merged first. |
OK, my apologies for the wrong accusation. It wasn't clear to me from the patch that this was a merged PR. |
In addition to the already mentioned PHP on Ubuntu problem now the Dart on macOS timed out. How to proceed @teverett? Maybe you can restart the task and maybe this time it does not time out. Or just merge straight away? |
The dart2 problem is a network issue. Github actions are pounding on servers for dart libraries and they cannot keep up with the requests. Sometimes builds work sometimes not because of load. We've seen this many times before. Merge please. |
@kaby76 @mike-lischke thanks so much for this. |
: GRANT OWNERSHIP (
ON (
object_type_name object_name
| ALL object_type_plural IN ( DATABASE id_ | SCHEMA schema_name)
@mike-lischke any reason why the space characters after "(" and before "DATABASE" were kept?
Well, no specific reason actually. Might be a formatter bug. I believe that's coming up when an alt is placed entirely on one line and contains groups.
any reason why the space characters after "( DATABASE" were kept?
@mlorek Say again? The space characters after the string "( DATABASE"???
Line 263:
grammars-v4/sql/snowflake/SnowflakeParser.g4
Line 263 in c36b614
| ALL object_type_plural IN ( DATABASE id_ | SCHEMA schema_name) |
The grammar must have at least one space after the string "( DATABASE", which is the same as one space after "DATABASE", which is the space currently at line 263, column 51. And, in fact, there is exactly one space at line 263, column 51, not multiple. Otherwise, the code would be "DATABASEid_", which is surely wrong.
sry, space character after "(" and before "DATABASE"
| ( schema_privileges | ALL PRIVILEGES?) ON ( FUTURE SCHEMAS IN DATABASE id_)
| (schema_object_privileges | ALL PRIVILEGES?) ON (
object_type object_name
| ALL object_type_plural IN ( DATABASE id_ | SCHEMA schema_name)
same here (
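The spots raised in these review comments, an opening parenthesis followed by padding and then more content on the same line, could be located with a small scan like the one below. It is a rough sketch under the assumption that a line-by-line regex is sufficient; `find_padded_parens` is a hypothetical helper, not part of antlr-format-cli, and it does not skip string literals or comments.

```python
import re

# "(" followed by spaces or tabs and then more content on the same line.
# An opening parenthesis at the end of a line is deliberately not matched.
PADDED_PAREN = re.compile(r"\([ \t]+(?=\S)")


def find_padded_parens(line: str):
    """Return the 1-based column of every '(' that is followed by padding."""
    return [match.start() + 1 for match in PADDED_PAREN.finditer(line)]


line = "| ALL object_type_plural IN ( DATABASE id_ | SCHEMA schema_name)"
columns = find_padded_parens(line)
```

The reported columns are relative to the string passed in, so they only match editor columns when the line is given with its original indentation.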
This patch applies common formatting rules to all grammars in the repository. Naturally, it's very large - probably the largest patch you will ever get 😅 and if you don't like that, I can open multiple PRs. However, that will not lower the work of reviewing the changes (quite the opposite).
All grammars also got the used formatting rules attached, so if you want, you can do your own formatter run using antlr-format-cli.