Chardet is a character encoding detection module for Node.js written in pure JavaScript. The module is based on the ICU project (http://site.icu-project.org/), which uses character occurrence analysis to determine the most probable encoding.
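To give a feel for how byte-level detection works, here is a minimal sketch. Note that chardet's actual analysis (inherited from ICU) is far more elaborate; `detectSketch` and its rules below are invented purely for illustration, not taken from the library:

```javascript
// Toy encoding detector: check for exact BOM signatures first, then
// validate the stream against UTF-8's multi-byte sequence rules.
// A real detector scores many candidate encodings statistically.
function detectSketch(buf) {
  // Byte-order marks are unambiguous signatures.
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) return 'UTF-8';
  if (buf.length >= 2 && buf[0] === 0xff && buf[1] === 0xfe) return 'UTF-16LE';
  if (buf.length >= 2 && buf[0] === 0xfe && buf[1] === 0xff) return 'UTF-16BE';

  // No BOM: walk the buffer checking UTF-8 lead/continuation byte patterns.
  var i = 0, valid = true, multibyte = false;
  while (i < buf.length) {
    var b = buf[i], extra = 0;
    if (b < 0x80) { i++; continue; }            // plain ASCII byte
    else if ((b & 0xe0) === 0xc0) extra = 1;    // 2-byte sequence lead
    else if ((b & 0xf0) === 0xe0) extra = 2;    // 3-byte sequence lead
    else if ((b & 0xf8) === 0xf0) extra = 3;    // 4-byte sequence lead
    else { valid = false; break; }
    for (var j = 1; j <= extra; j++) {
      // Continuation bytes must look like 10xxxxxx.
      if (i + j >= buf.length || (buf[i + j] & 0xc0) !== 0x80) { valid = false; break; }
    }
    if (!valid) break;
    multibyte = true;
    i += extra + 1;
  }
  if (valid && multibyte) return 'UTF-8';
  if (valid) return 'ASCII';
  return 'ISO-8859-1'; // fallback guess for arbitrary 8-bit data
}
```

For example, `detectSketch(Buffer.from('hello'))` yields `'ASCII'`, while a buffer containing accented characters encoded as UTF-8 yields `'UTF-8'`.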
```sh
npm i chardet
```
```js
var chardet = require('chardet');

chardet.detect(Buffer.from('hello there!'));
// or
chardet.detectFile('/path/to/file', function(err, encoding) {});
// or
chardet.detectFileSync('/path/to/file');
```
Sometimes, when the data set is huge and you want to optimize performance (at the cost of some accuracy), you can sample only the first N bytes of the buffer:
```js
chardet.detectFile('/path/to/file', { sampleSize: 32 }, function(err, encoding) {});
```
Currently only a limited set of encodings is supported; more will be added soon.