| ## The Cultural Evolution of gofmt |
| gofmt 的文化演变 |
| The Cultural Evolution of gofmt |
| |
| Robert Griesemer |
| Google, Inc. |
| gri@golang.org |
| |
| |
| * gofmt |
| |
| ## Go source code formatter |
| - Go源代码格式化工具 |
| |
| ## Defines _de_facto_ Go formatting |
| - 定义了“标准“格式 |
| |
| ## All submitted Go code must be formatted with `gofmt` in `golang.org` repos. |
| - golang.org代码库中所有提交的Go代码都必须通过gofmt格式化过 |
| |
| ## Functionality available outside gofmt via `go/format` library. |
| - 除了gofmt之外,相同功能可以通过go/format库获得 |
| |
| ## No knobs! |
| - 不需要设置! |
| |
| ## Original motivation |
| * 初衷 |
| |
| ## Code reviews are software engineering best practice. |
| - 代码审查是软件工程的最佳实践 |
| |
| ## Code reviews are informed by style guides, prescribe formatting. |
| - 代码审查是基于代码规范和正规格式的 |
| # Google C++ style guide: ~65 pages (~15p on formatting) |
| # Google C++的规范:~65页(~15页是关于格式) |
| # Go spec: ~50 pages |
| # Go的细则:50页 |
| |
| ## *Much*too*much*time*lost*on*reviewing*formatting*rather*than*code.* |
| - *太多时间浪费在审查格式上而不是代码本身了* |
| # Example: Formatting review time 10 min/day, 600 engineers => 100 manhours/day! |
| # 例子:格式审查需要10分钟/天,600工程师=>100人时/天 |
| |
| ## Yet it's the perfect job for a machine. |
| - 但是这工作对机器来说是最好不过了的 |
| |
| ## Day 1 decision to write a pretty printer for Go. |
| - 第一个决定就是要写一个好的格式美化器 |
| # informed by experience with Java and C++ code reviews at Google |
| # 基于在Google的Java和C++的代码审查经验 |
| |
| ## History |
| * 历史 |
| |
| ## Pretty printers and code beautifiers existed since the early days of computing. |
| - 格式美化器和代码美化工具在计算机发展的早期就已出现 |
| |
| ## Essential to produce readable Lisp code: |
| - 对于产生可读的Lisp代码很重要的: |
| |
| GRINDEF (Bill Gosper, 1967) 第一个计算行长度 |
| |
| ## Many others: |
| - 其他: |
| |
| SOAP (R. Scowen et al, 1969) 简化了晦涩的算法程序 |
| NEATER2 (Ken Conrow, R. Smith, 1970) PL/1格式器,作为(早期的)纠错工具 |
| cb (Unix Version 7, 1979) C程序美化器 |
| indent (4.2 BSD, 1983) 缩进和格化化C代码 |
| 等等 |
| |
| ## More recently: |
| - 最近的: |
| |
| ClangFormat C/C++/Objective-C 格式器 |
| Uncrustify C, C++, C#, ObjectiveC, D, Java, Pawn and VALA的美化器 |
| 等等 |
| |
| |
| ## Reality check |
| * 事实上 |
| |
| ## In 2007, nobody seemed to like source code formatters. |
| - 在2007年,没人喜欢代码格式器 |
| |
| ## Exception: IDE-imposed formatting. |
| - 例外:IDE强制的格式化 |
| |
| ## But: Many programmers don't use IDEs… |
| - 但是:很多程序员不用IDE... |
| |
| ## Problem: If automatic formatting feels too destructive, it is not used. |
| - 问题:如果是格式化太具有毁坏性,那么就没有人会用 |
| |
| ## Missing insight: "good enough" uniform formatting style is better than having lots of different formats. |
| - 被忽视的观点:“刚刚好“的,统一化的格式是好过于各种不同的格式的。 |
| |
| ## Value of style guide: Uniformity, not perfection. |
| - 规范的价值在于:整齐划一,而不是完美 |
| |
| |
| ## The problem with pretty printers |
| * 好的格式美化器的问题 |
| |
| ## The more people think about their own formatting style, the more they get attached to it. |
| - 当越多人思考他们自己的格式风格的时候,他们就变得更加固执于此了 |
| # religion |
| # 宗教 |
| |
| ## Wrong conclusion: Automatic formatters must permit a lot of formatting options! |
| - 错误的结论:自动格式器必须要有很多选项! |
| |
| ## But: Formatters with too many options defeat the purpose. |
| - 但是有很多选项的格式器其实违背他们的目的 |
| # e.g., indent |
| # 比如说indent |
| |
| ## Also: Very hard to do a good job. |
| - 此外,支持很多选项是难的 |
| # combinatorial explosion of styles to test |
| # 太多组合需要测试 |
| |
| ## Respecting user intent is key. |
| - 尊重用户的想法是最关键的 |
| |
| ## Dealing with comments is hard. |
| - 处理注释是很难的 |
| |
| ## Language may add extra complexity (e.g., C macros) |
| - 语言本身也会增加很多额外的复杂度(比如,C的宏) |
| |
| |
| ## Formatting Go |
| * 格式化Go |
| |
| ## Keep it as simple as possible |
| * 尽量保证其简单 |
| |
| ## Small language makes task much simpler. |
| - 小的语言能让事情变得简单 |
| |
| ## Don't fret over line length control. |
| - 不要为行长度烦恼 |
| |
| ## Instead, respect user: Consider line breaks in original source. |
| - 相反的,尊重用户:考虑原有代码中的断行 |
| |
| ## Don't support any options. |
| - 不要支持任何选项 |
| |
| ## Make it easy to use. |
| - 使其使用傻瓜化 |
| |
| ## *One*formatting*style*to*rule*them*all!* |
| *一个格化标准搞定所有!* |
| |
| |
| ## Basic structure of gofmt |
| * gofmt的基本结构 |
| |
| ## Parsing of source code |
| - 源代码的处理 |
| |
| ## Basic formatting |
| - 基本的格式化 |
| |
| ## Enhancement: Handling of comments |
| - 附加:注释的处理 |
| |
| ## Make it nice: Alignment of code and comments |
| - 完善:代码和注释的对齐 |
| |
| ## But: No fancy general layout algorithms. |
| - 但是,没有牛X的通用布局算法 |
| |
| ## Instead: Node-specific fine tuning. |
| - 相反的:基于节点的精细优化 |
| |
| |
| ## Parsing source code |
| * 处理源代码 |
| |
| ## Use `go/scanner`, `go/parser`, and friends. |
| - 使用`go/scanner`, `go/parser`及其相关的库 |
| |
| ## Result is an abstract syntax tree (`go/ast`) for each `.go` file. |
| - 给每一个go文件生成一个抽象语法树 |
| # misnomer: AST is actually a concrete syntax tree |
| # 用词不当:抽象语法树其实是一个具体语法树 |
| |
| ## Each syntactic construct has a corresponding AST node. |
| - 每一个语法结构都有相应的AST节点 |
| |
| // Syntax of an if statement. |
| IfStmt = "if" [ SimpleStmt ";" ] Expression Block [ "else" ( IfStmt | Block ) ] . |
| |
| // An IfStmt node represents an if statement. |
| IfStmt struct { |
| If token.Pos // position of "if" keyword |
| Init Stmt // initialization statement; or nil |
| Cond Expr // condition |
| Body *BlockStmt |
| Else Stmt // else branch; or nil |
| } |
| |
| ## AST nodes have (selected) position information. |
| - AST节点有(选择性的)位置信息。 |
| |
| |
| ## Basic formatting |
| * 基本的格式化 |
| |
| ## Traverse AST and print nodes. |
| - 遍历AST然后打印每个节点 |
| |
| case *ast.IfStmt: |
| p.print(token.IF) |
| p.controlClause(false, s.Init, s.Cond, nil) |
| p.block(s.Body, 1) |
| if s.Else != nil { |
| p.print(blank, token.ELSE, blank) |
| switch s.Else.(type) { |
| case *ast.BlockStmt, *ast.IfStmt: |
| p.stmt(s.Else, nextIsRBrace) |
| default: |
| p.print(token.LBRACE, indent, formfeed) |
| p.stmt(s.Else, true) |
| p.print(unindent, formfeed, token.RBRACE) |
| } |
| } |
| |
| ## Printer (`p.print`) accepts a sequence of tokens, including position and white space information. |
| - 打印器(`p.print`)接收包括位置和空格符等的一系列记号 |
| |
| |
| ## Fine tuning |
| * 细致的调节 |
| |
| ## Precedence-dependent spacing between operands. |
| - 基于优先级安排操作数之间的空格. |
| |
| # implemented by rsc in gofmt |
| |
| ## Improves readability of expressions. |
| - 提高表达式的可读性. |
| |
| x = a + b |
| x = a + b*c |
| if a+b <= d { |
| if a+b*c <= d { |
| |
| ## Use position information to guide line break decisions. |
| - 使用位置信息决定何时换行. |
| |
| ## Various other heuristics. |
| - 其他一些策略. |
| |
| |
| ## Handling of comments |
| * 注释的处理 |
| |
| ## Comments can appear between any two tokens of a program. |
| - 注释可以出现在程序的任何两个词汇之间. |
| |
| ## In general, not obviously clear to which AST node a comment belongs. |
| - 通常情况下,不能很明显的知道注释属于哪个 AST 节点. |
| |
| # In retrospect, a heuristic might have been better than the list of comments |
| # we have now. See Lessons learned. |
| |
| ## Comments often come in groups: |
| - 注释经常是成组出现: |
| |
| // A CommentGroup represents a sequence of comments |
| // with no other tokens and no empty lines between. |
| // |
| type CommentGroup struct { |
| List []*Comment // len(List) > 0 |
| } |
| |
| ## Grouped comments treated as a single larger comment. |
| - 成组的注释被处理为一个大的注释. |
| |
| |
| ## Representation of comments in the AST |
| * 注释在 AST 上的表达 |
| |
| ## Sequential list of comment groups attached to the ast.File node. |
| - 注释组的连续列表被连接到 AST 的文件节点. |
| |
| # In retrospect this was not a good decision. It's general but puts burden on AST clients. |
| |
| ## Additionally, comments that are identified as _doc_strings_ are attached to declaration nodes. |
| - 另外,一些被标示为 _doc_strings_ 的注释被连接到声明节点. |
| |
| .image ./gofmt/comments.jpg 425 600 |
| |
| |
| ## Formatting with comments |
| * 格式化注释 |
| |
| ## Basic idea: Merge "token stream" with "comment stream" based on position information. |
| - 基本的办法:基于位置信息合并词汇流和注释流. |
| |
| .image ./gofmt/merge.jpg 425 700 |
| |
| |
| ## Devil is in the details |
| * 魔鬼就在细节中 |
| |
| # It's an entire hell of devils, really. |
| |
| ## Estimate current position in "source code space". |
| - 在源代码中估计当前的位置. |
| |
| ## Compare current position with comment position to decide what's next. |
| - 比较当前的位置和注释的位置去决定下一个是什么. |
| |
| ## Token stream also contains "white space" tokens - comments must be properly interspersed! |
| - 词汇也包含了空格词汇 - 注释必须被合理的分布! |
| |
| ## Maintain buffer of unprinted white space, flush before next token, intersperse comments. |
| - 维持一个未被打印的空格缓冲区,在下一个词汇之前输出,然后分布注释. |
| |
| ## Various heuristics to get white space correct. |
| - 多种策略得以正确地处理空格. |
| |
| ## Lots of trial and error. |
| - 很多次的尝试和错误. |
| |
| |
| ## Formatting individual comments |
| * 格式化单独的注释 |
| |
| ## Distinguish between line and general comments. |
| - 区分代码行和注释. |
| |
| ## Try to properly indent multi-line general comments: |
| - 努力对多行注释进行合理的缩进. |
| |
| func f() { func() { |
| /* /* |
| * foo * foo |
| * bar ==> * bar |
| * bal * bal |
| */ */ |
| if ... if ... |
| } } |
| |
| ## Doesn't always work well. |
| - 但并不总是能够处理正确. |
| |
| ## Want both: comments indented, and comment contents left alone. No good solution. |
| - 想达到两个效果:注释能够缩进,注释的内容不进行处理。还没有好的解决办法. |
| |
| |
| ## Alignment |
| * 对齐 |
| |
| ## Carefully chosen alignment can make code easier to read: |
| - 仔细选择的对齐可以让代码更容易阅读. |
| |
| var ( var ( |
| x, y int = 2, 3 // foo x, y int = 2, 3 // foo |
| z float32 // bar ==> z float32 // bar |
| s string // bal s string // bal |
| ) ) |
| |
| ## Painful to maintain manually (regular tabs don't do the job). |
| - 很难进行手工维护 (制表符并不能够做到). |
| |
| ## Perfect job for a formatter. |
| - 但是却非常适合使用格式化工具. |
| |
| |
| ## Elastic tabstops |
| * 灵活的制表符宽度 |
| |
| ## Regular tabs (`\t`) advance writing position to fixed tab stops. |
| 通常的制表符把当前的写位置移动到下一个固定的位置. |
| |
| ## Basic idea: Make tab stops _elastic_. |
| 基本的办法:让制表符宽度更加灵活. |
| |
| ## A tab is used to indicate the _end_ of a text _cell_. |
| - 制表符可以标示一个文本单元的结束位置. |
| |
| ## A _column_block_ is a run of uninterrupted vertically adjacent cells. |
| - 一个列块是一个连续的相邻的单元. |
| |
| ## A column block is as wide as the widest piece of text in the cells. |
| - 一个列块的宽度可以到达多个单元里最宽文本的宽度. |
| |
| ## Proposed by Nick Gravgaard, 2006 |
| 被 Nick Gravgaard 提出于2006 |
| |
| .link http://nickgravgaard.com/elastic-tabstops/ |
| |
| ## Implemented by `text/tabwriter` package. |
| 实现在 `text/tabwriter` 包中. |
| |
| |
| ## Elastic tabstops illustrated |
| * 灵活制表符宽度的展示 |
| |
| .image ./gofmt/tabstops.jpg 500 700 |
| |
| |
| ## Putting it all together (1) |
| * 综合在一起 (1) |
| |
| ## Parser generates AST. |
| - 分析器生成 AST. |
| |
| ## Printer prints AST recursively, uses tabs (`\t`) to indicate elastic tab spots. |
| - 打印工具递归地打印AST,使用制表符去灵活的标示制表符的位置. |
| |
| ## The resulting token, position, and whitespace stream is merged with the "stream" of comments. |
| - 产生的词汇,位置和空格流会和注释流进行合并. |
| |
| ## Tokens are expanded into strings; all text flows through a tabwriter. |
| - 词汇会扩展为字符串,所有的文本流将会被制表符写入器处理. |
| |
| ## Tabwriter replaces tabs with appropriate amount of blanks. |
| - 制表符写入器会将制表符替换为合适数量的空格. |
| |
| ## Works well for fixed-width fonts. |
| 对于固定宽度的字体,处理的很好. |
| |
| ## Proportional fonts could be handled by an editor supporting elastic tab stops. |
| 比例大小的字体也可以被编辑器支持,如果这个编辑器可以支持灵活的制表符宽度. |
| |
| # go/printer can produce output containing elastic tab stops |
| |
| |
| ## Putting it all together (2) |
| * 综合在一起 (2) |
| |
| .image ./gofmt/bigpic.jpg 550 800 |
| |
| |
| ## The big picture |
| * 从宏观上看 |
| |
| .image ./gofmt/biggerpic.jpg 400 800 |
| |
| |
| ## gofmt applications |
| * gofmt 的应用 |
| |
| ## gofmt as source code transformer |
| * gofmt 作为源代码变换工具 |
| |
| ## Go rewriter (Russ Cox), `gofmt`-r` |
| - 改写 Go 的代码 (Russ Cox), `gofmt`-r` |
| |
| gofmt -w -r 'a[i:len(x)] -> a[i:]' *.go |
| |
| ## Go simplifier, `gofmt`-s` |
| - 简化 Go 的代码, `gofmt`-s` |
| |
| ## API updater (Russ Cox), `go`fix` |
| - 更新 API (Russ Cox), `go`fix` |
| |
| ## Language changes (removal of semicolons, others) |
| - 改变语言 (去掉分号,其它) |
| |
| ## goimport |
| - goimport (Brad Fitzpatrick) |
| |
| |
| ## Reactions |
| * 大家的反应 |
| |
| ## The Go project mandates that all submitted code is gofmt-ed. |
| - Go 项目要求所有提交的源代码都用 gofmt 的格式。 |
| |
| ## First, complaints: `gofmt` doesn't do _my_ style! |
| - 一开始,大家都抱怨:`gofmt` 不知道怎样格式成我的风格! |
| |
| ## Eventually, acquiescence: The Go Team really means it! |
| - 慢慢地,大家不作声了:Go 项目组一定要用 gofmt! |
| |
| ## Finally, insight: gofmt's style is _nobody's_ favorite, yet `gofmt` is everybody's favorite. |
| - 最后,大家看清了:gofmt 不是任何人的风格,但所有人都喜欢 gofmt 的风格。 |
| |
| ## Now, praise: `gofmt` is one of the many reasons why people like Go. |
| - 现在,大家都赞扬: `gofmt` 是大家喜欢 Go 的一个原因。 |
| |
| ## Formatting has become a non-issue. |
| 现在,格式已经不是一个问题。 |
| |
| |
| ## Others are starting to take note |
| * 其它语言也在向我们学习 |
| |
| ## Formatter for Google's BUILD files (Russ Cox). |
| - Google 的 BUILD 文件现在也有格式器 (Russ Cox). |
| |
| ## Java formatter |
| - Java 格式器 |
| |
| ## Clang formatter |
| - Clang 格式器 |
| |
| - Dartfmt |
| .link https://www.dartlang.org/tools/dartfmt/ |
| |
| ## etc. |
| - 等等 |
| |
| ## Automatic source code formatting is becoming a requirement for any kind of language. |
| 现在,任何语言都被要求带有自动的源代码格式器。 |
| |
| |
| ## Conclusions |
| * 总结 |
| |
| ## Evolution in programming culture |
| * 编程文化的演变 |
| |
| ## `gofmt` is significant selling point for Go |
| - `gofmt` 是 Go 语言的一个重要的卖点 |
| |
| ## Insight is spreading that uniform "good enough" formatting is hugely beneficial. |
| - 大家渐渐达成共识:一致的“足够好“的格式很有好处 |
| |
| # no need for detailed formatting style guides |
| # 无需详细的格式风格手册 |
| |
| # no time wasted on formatting |
| # 无需在格式上浪费时间 |
| |
| # improved readability |
| # 代码的可读性提高了 |
| |
| # smaller diffs when changing code |
| # 改代码时代码的变动变小了 |
| |
| ## Source code manipulation at AST-level enables a new category of tools. |
| - 这种在 AST-级别上的源代码操作带动了一系列的新的工具。 |
| |
| # simple to complex automatic source code transformations |
| # 各种各样的,从简单到复杂的,自动的源代码变换 |
| |
| # various auto-completion mechanisms (e.g. goimport) |
| # 各种自动完成的机制 (例如 goimport) |
| |
| # enables syntax evolution |
| # 使得语法可以慢慢进化 |
| |
| ## Others are taking note: Programming culture is slowly evolving. |
| - 其它语言也在向我们学习:编程的文化在慢慢演变。 |
| |
| |
| ## Lessons learned: Application |
| * 至今的收获:应用程序 |
| |
| ## Basic source code formatting is great initial goal. |
| - 一开始,基本的源代码格式化是一个很好的目标。 |
| |
| ## True power lies in source code transformation tools. |
| - 但是,真正的用处在于源代码的变换工具。 |
| |
| ## Avoid formatting options. |
| - 不要给大家有选择格式的机会。 |
| |
| ## Keep it simple. |
| - 越简单越好。 |
| |
| ## Want: |
| 我们想要: |
| |
| ## Go parser: source code => syntax tree |
| - Go 分析器:源代码 => 语法树 |
| |
| ## Make it easy to manipulate syntax tree in any way possible. |
| - 尽可能让语法树的操作变得容易。 |
| |
| ## Go printer: syntax tree => source code |
| - Go 打印器:语法树 => 源代码 |
| |
| |
| ## Lessons learned: Implementation |
| * 至今的收获:实现过程 |
| |
| ## Lots of trial and error in initial version. |
| - 最初的版本有很多的尝试和失败。 |
| |
| ## Single biggest mistake: comments not attached to AST nodes. |
| - 最大的错误:注释没有连到 AST-节点上. |
| |
| ## => Current design makes it extremely hard to manipulate AST |
| ## and maintain comments in right places. |
| => 现在的设计使得操作 AST 和保持注释在正确的地方十分困难。 |
| |
| ## Cludge: ast.CommentMap |
| - 很混乱:ast.CommentMap |
| |
| ## Want: |
| 我们想要: |
| |
| ## Easy to manipulate syntax tree with comments attached. |
| - 容易操作语法树,连带注释。 |
| |
| |
| ## Going forward |
| * 将来的计划 |
| |
| ## Design of new syntax tree in the works (still experimental). |
| - 正在设计新的语法树(仍在试验阶段) |
| |
| ## Syntax tree simpler and easier to manipulate (e.g., declaration nodes) |
| - 语法树操作起来更加简单和容易(例如:声明结点) |
| |
| ## Faster and easier to use parser and printer. |
| - 更快和更容易地使用分析器和打印器。 |
| |
| ## Make it robust and fast. Don't do anything else. |
| - 让工具用起来可靠并且快。其它一概不理。 |
| |
| # no semantic analyses in parser |
| # 分析器不作语义分析。 |
| |
| # no options in printer |
| # 打印器里没有任何选择。 |
| |
| |
| # ---------------------------------------------------------------------------------- |
| # |
| # Implementation size |
| # |
| # go/token 849 lines lexical tokes, source positions |
| # go/scanner 884 lines tokenization |
| # go/parser 2689 lines parsing |
| # go/ast 2966 lines abstract syntax tree, tree traversal |
| # go/printer 2948 lines actual AST printer |
| # go/format 115 lines helper library to make printer easy to use |
| # internal/format 161 lines |
| # cmd/gofmt 801 lines gofmt tool |
| # ---------------------------- |
| # 11413 lines |