commit | 3a82255431918bb7c2e1c09c964a18991756910b | |
---|---|---|
author | Marcel van Lohuizen <mpvl@golang.org> | Sun May 17 12:22:50 2020 +0200 |
committer | Marcel van Lohuizen <mpvl@golang.org> | Fri Jun 12 05:17:30 2020 +0000 |
tree | 6317f8c4f3b7b5b0bc89f02b2ad1a5d259dc3bf9 | |
parent | 81608d7e9c6863c922f599e4ff1329a685218c0d | |
encoding/unicode: add UTF8BOM encoding

Some editors always add a BOM to UTF-8. Traditionally the BOM has been used to detect byte order encoding, which is irrelevant for UTF-8. These editors, however, use the UTF-8 BOM as a signature, which allows overriding a charmap encoding if such a BOM is present. This is possible as the occurrence of a BOM in such encodings is highly unlikely.

This UTF8BOM encoding implements a simple encoding for this. It is intended for applications that require a UTF-8 encoding, but want to handle files written by such editors without explicit BOM handling. It can also be used to create such files.

NOTE: there is currently no encoding that implements the fallback encoding of such editors. The BOMOverride functionality in this package allows implementing such an encoder with relative ease, though.

Change-Id: I430851a1d93351bf6055eebe88005984dde451d9
Reviewed-on: https://go-review.googlesource.com/c/text/+/234277
Reviewed-by: Russ Cox <rsc@golang.org>
This repository holds supplementary Go libraries for text processing, many involving Unicode.
This repo uses Semantic versioning (http://semver.org/).
Until version 1.0.0 of x/text is reached, the minor version is considered a major version. So going from 0.1.0 to 0.2.0 is considered to be a major version bump.
A major new CLDR version is mapped to a minor version increase in x/text. Any other new CLDR version is mapped to a patch version increase in x/text.
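The versioning rules above can be sketched in a few lines of Go. This is only an illustration of the stated policy; the version type and the functions bumpForCLDR and isBreaking are hypothetical names and do not exist in x/text. Resetting the patch number on a minor bump follows the general Semantic Versioning convention.

```go
package main

import "fmt"

// version is a simplified MAJOR.MINOR.PATCH triple (hypothetical type).
type version struct{ major, minor, patch int }

func (v version) String() string {
	return fmt.Sprintf("%d.%d.%d", v.major, v.minor, v.patch)
}

// bumpForCLDR returns the version that would follow v after a CLDR update:
// a major CLDR release increases the minor version, any other CLDR release
// increases the patch version.
func bumpForCLDR(v version, majorCLDR bool) version {
	if majorCLDR {
		return version{v.major, v.minor + 1, 0}
	}
	return version{v.major, v.minor, v.patch + 1}
}

// isBreaking reports whether moving from one version to another counts as a
// major version bump. Before 1.0.0 a minor increase is treated as major.
func isBreaking(from, to version) bool {
	if from.major == 0 && to.major == 0 {
		return to.minor > from.minor
	}
	return to.major > from.major
}

func main() {
	v := version{0, 3, 2}
	fmt.Println(bumpForCLDR(v, true))                           // 0.4.0
	fmt.Println(bumpForCLDR(v, false))                          // 0.3.3
	fmt.Println(isBreaking(version{0, 1, 0}, version{0, 2, 0})) // true
}
```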
It is important that the Unicode version used in x/text matches the one used by your Go compiler. The x/text repository supports multiple versions of Unicode and will match the version of Unicode to that of the Go compiler. At the moment this is supported for Go compilers from version 1.7.
The easiest way to install is to run go get -u golang.org/x/text. You can also manually git clone the repository to $GOPATH/src/golang.org/x/text.
To submit changes to this repository, see http://golang.org/doc/contribute.html.
To generate the tables in this repository (except for the encoding tables), run go generate from this directory. By default tables are generated for the Unicode version in core and the CLDR version defined in golang.org/x/text/unicode/cldr.
Running go generate will, as a side effect, create a DATA subdirectory in this directory, which holds all files that are used as a source for generating the tables. This directory will also serve as a cache.
Run
go test ./...
from this directory to run all tests. Add the “-tags icu” flag to also run ICU conformance tests (if available). This requires that you have the correct ICU version installed on your system.
TODO:
To generate the tables in this repository (except for the encoding tables), run go generate from this directory. By default tables are generated for the Unicode version in core and the CLDR version defined in golang.org/x/text/unicode/cldr.
Running go generate will, as a side effect, create a DATA subdirectory in this directory, which holds all files that are used as a source for generating the tables. This directory will also serve as a cache.
To update a Unicode version run
UNICODE_VERSION=x.x.x go generate
where x.x.x must correspond to a directory in https://www.unicode.org/Public/. If this version is newer than the version in core it will also update the relevant packages there. The idna package in x/net will always be updated.
To update a CLDR version run
CLDR_VERSION=version go generate
where version must correspond to a directory in https://www.unicode.org/Public/cldr/.
Note that the code gets adapted over time to changes in the data and that backwards compatibility is not maintained. So updating to a different version may not work.
The files in DATA/{iana|icu|w3|whatwg} are currently not versioned.
This repository uses Gerrit for code changes. To learn how to submit changes to this repository, see https://golang.org/doc/contribute.html.
The main issue tracker for the text repository is located at https://github.com/golang/go/issues. Prefix your issue with “x/text:” in the subject line, so it is easy to find.