cdd60fa89d3a1fe76aeb1b74054f6e2522dbdf69 gosrc: Allow Unicode letters in import paths.
Background
The following is a valid vanity import path that works without issues:
dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание
It works with go get, go install, go test, and go doc:
$ go get -u dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание
$ go install dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание
$ go test dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание
ok dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание 0.014s
$ go doc dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание
package испытание // import "dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание"
Package испытание demonstrates Unicode capabilities in Go source code.
type Эксперимент struct{ ... }
func Испытание() Эксперимент
You can also call vcs.RepoRootForImportPath("dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание", false)
(from golang.org/x/tools/go/vcs) successfully on the vanity import path:
$ goexec 'vcs.RepoRootForImportPath("dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание", false)'
(*vcs.RepoRoot)(&vcs.RepoRoot{
VCS: (*vcs.Cmd)(&vcs.Cmd{
Name: (string)("Git"),
Cmd: (string)("git"),
CreateCmd: (string)("clone {repo} {dir}"),
DownloadCmd: (string)("pull --ff-only"),
TagCmd: ([]vcs.TagCmd)([]vcs.TagCmd{
(vcs.TagCmd)(vcs.TagCmd{
Cmd: (string)("show-ref"),
Pattern: (string)("(?:tags|origin)/(\\S+)$"),
}),
}),
TagLookupCmd: ([]vcs.TagCmd)([]vcs.TagCmd{
(vcs.TagCmd)(vcs.TagCmd{
Cmd: (string)("show-ref tags/{tag} origin/{tag}"),
Pattern: (string)("((?:tags|origin)/\\S+)$"),
}),
}),
TagSyncCmd: (string)("checkout {tag}"),
TagSyncDefault: (string)("checkout master"),
LogCmd: (string)(""),
Scheme: ([]string)([]string{
(string)("git"),
(string)("https"),
(string)("http"),
(string)("git+ssh"),
}),
PingCmd: (string)("ls-remote {scheme}://{repo}"),
}),
Repo: (string)("https://github.com/shurcooL-test/go-get-issue-unicode"),
Root: (string)("dmitri.shuralyov.com/temp/go-get-issue-unicode"),
})
(interface{})(nil)
However, gosrc.IsValidRemotePath incorrectly reports false for the
"dmitri.shuralyov.com/temp/go-get-issue-unicode/испытание" import path.
Fix
gosrc.IsValidRemotePath reports false for such import paths because
the validPathElement regexp allows only the ASCII letters A-Za-z, not
Unicode letters in general.
This change fixes that by using the predefined Unicode character
property class \p{L}, which matches any Unicode letter.
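A minimal sketch of the difference, for illustration only. The two
patterns below are hypothetical simplifications and are not the exact
expressions used in gosrc; they only show how replacing A-Za-z with
\p{L} changes what a path element may contain.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration; the real validPathElement
// expression in gosrc may differ in its details.
var (
	// asciiElement accepts only ASCII letters, digits, '-', '_' and '.'.
	asciiElement = regexp.MustCompile(`^[A-Za-z0-9][-A-Za-z0-9_.]*$`)
	// unicodeElement accepts any Unicode letter in place of A-Za-z.
	unicodeElement = regexp.MustCompile(`^[\p{L}0-9][-\p{L}0-9_.]*$`)
)

func main() {
	elems := []string{"go-get-issue-unicode", "испытание", "bad element"}
	for _, elem := range elems {
		fmt.Printf("%q ascii=%t unicode=%t\n",
			elem, asciiElement.MatchString(elem), unicodeElement.MatchString(elem))
	}
}
```

Go's regexp package (RE2 syntax) accepts \p{L} inside a bracketed
character class, so the change can stay local to the letter ranges.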
Additionally, fix an issue where a query parameter value was not
correctly escaped when constructing a URL.
Fixes golang/gddo#468.
Updates golang/go#18660.
References
- https://stackoverflow.com/questions/3617797/regex-to-match-only-letters
- https://stackoverflow.com/questions/6005459/is-there-a-way-to-match-any-unicode-non-alphabetic-character
- https://www.regular-expressions.info/unicode.html#prop
Change-Id: I48680749d827cbc63fefca2c21e9790009f20746
Reviewed-on: https://go-review.googlesource.com/41750
Reviewed-by: Chris Broadfoot <cbro@golang.org>
Reviewed-by: Tuo Shan <shantuo@google.com>
Reviewed-by: Francesc Campoy Flores <campoy@golang.org>
3 files changed