scikitjs

javascriptdata | 839 | ISC | 1.24.0 | TypeScript support: included

Scikit-Learn for JS

pandas, data-analysis, data-manipulation, analysis

scikit.js

TypeScript package for predictive data analysis, data preparation and machine learning.

Aims to be a TypeScript port of the scikit-learn Python library.

This library is for users who wish to train or deploy their models to JS environments (browser, mobile) but with a familiar API.

Generic math operations are powered by the TensorFlow.js core layer for faster calculation.

Documentation site: www.scikitjs.org

Installation

Frontend Users

For use with modern bundlers in a frontend application, simply run

npm i @tensorflow/tfjs scikitjs

We depend on TensorFlow.js to make our calculations fast, but we don't ship it in our bundle. We use it as a peer dependency, so you install it yourself and hand it to scikit.js with setBackend. General usage is as follows.

import * as tf from '@tensorflow/tfjs'
import * as sk from 'scikitjs'
sk.setBackend(tf)

This allows us to build a library that can be used in Deno, Node, and the browser with the same configuration.
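
The idea of handing the library its backend, rather than bundling one, can be illustrated with a generic dependency-injection sketch. This is not how scikit.js is implemented internally; `injectBackend`, `requireBackend`, `total`, and `fakeBackend` are hypothetical names made up for the example.

```javascript
// Hypothetical sketch of backend injection; none of these names come from scikit.js.
let backend = null

// The application calls this once, passing in the math library it installed.
function injectBackend(lib) {
  backend = lib
}

// Library code asks for the backend and fails loudly if none was provided.
function requireBackend() {
  if (backend === null) {
    throw new Error('No backend set. Call injectBackend(lib) first.')
  }
  return backend
}

// A library function that delegates the math to whatever backend was injected.
function total(values) {
  return requireBackend().sum(values)
}

// A stand-in backend for demonstration purposes only.
const fakeBackend = {
  sum: (values) => values.reduce((acc, v) => acc + v, 0),
}

injectBackend(fakeBackend)
console.log(total([1, 2, 3])) // 6
```

Because the library only touches the injected object, the same library code can run against different builds of the backend depending on the environment.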

Backend Users

For Node.js users who wish to bind to the TensorFlow C++ library, simply install the tfjs-node package and use that as the tf library.

npm i @tensorflow/tfjs-node scikitjs

const tf = require('@tensorflow/tfjs-node')
const sk = require('scikitjs')
sk.setBackend(tf)

Note: If you have ESM enabled (by setting "type": "module" in your package.json), then you can consume this library with import / export, like in the following code block.

import * as tf from '@tensorflow/tfjs-node'
import * as sk from 'scikitjs'
sk.setBackend(tf)

Script src

For those who wish to use script tags, simply add

<script type="module">
  import * as tf from 'https://cdn.skypack.dev/@tensorflow/tfjs'
  import * as sk from 'https://cdn.skypack.dev/scikitjs'
  sk.setBackend(tf)

  // or alternatively you can pull the bundle from unpkg
  // import * as sk from "https://unpkg.com/scikitjs/dist/web/index.min.js"
</script>

Simple Example

import * as tf from '@tensorflow/tfjs'
import { setBackend, LinearRegression } from 'scikitjs'
setBackend(tf)

const lr = new LinearRegression({ fitIntercept: false })
const X = [[1], [2]] // 2D Matrix with a single column vector
const y = [10, 20]

await lr.fit(X, y)

lr.predict([[3], [4]]) // roughly [30, 40]
console.log(lr.coef)
console.log(lr.intercept)

Coming from Python?

This library aims to be a drop-in replacement for scikit-learn but for JS environments. There are some differences in deploy environment and underlying libraries that make for a slightly different experience. Here are the 3 main differences.

1. Class constructors take in objects. Every other function takes in positional arguments.

While I would have liked to make every function identical to the Python equivalent, it wasn't possible. In Python, one has named arguments, meaning that all of these are valid function calls.

Python

def myAdd(a=0, b=100):
  return a+b

print(myAdd()) # 100
print(myAdd(a=10)) # 110
print(myAdd(b=10)) # 10
print(myAdd(b=20, a=20)) # 40 (order doesn't matter)
print(myAdd(50,50)) # 100

JavaScript doesn't have named parameters, so one must choose between positional arguments and passing in a single object with all the parameters.

For many classes in scikit-learn, the constructors take in a ton of arguments with sane defaults, and the user usually only specifies the ones they'd like to change. This rules out the positional approach.
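
For comparison with the Python snippet above, here is a minimal sketch of the single-object approach using destructuring defaults. It only illustrates the pattern; `myAdd` is the same toy function as in the Python example, not something exported by scikit.js.

```javascript
// Emulate named arguments with one destructured object parameter.
// The trailing `= {}` lets callers omit the argument entirely.
function myAdd({ a = 0, b = 100 } = {}) {
  return a + b
}

console.log(myAdd()) // 100
console.log(myAdd({ a: 10 })) // 110
console.log(myAdd({ b: 10 })) // 10
console.log(myAdd({ b: 20, a: 20 })) // 40 (order doesn't matter)
console.log(myAdd({ a: 50, b: 50 })) // 100
```

The caller names only the options they want to override, which is the same ergonomics the class constructors aim for.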

After a class is created, most function calls really only take in 1 or 2 arguments (think fit, predict, etc). In that case, I'd rather simply pass them positionally. So to recap:

Python

from sklearn.linear_model import LinearRegression

X, y = [[1],[2]], [10, 20]
lr = LinearRegression(fit_intercept = False)
lr.fit(X, y)

Turns into

JavaScript

import * as tf from '@tensorflow/tfjs'
import { setBackend, LinearRegression } from 'scikitjs'
setBackend(tf)

let X = [[1], [2]]
let y = [10, 20]
let lr = new LinearRegression({ fitIntercept: false })
await lr.fit(X, y)

You'll also notice in the code above that these are actual classes in JS, so you'll need to create them with new.

2. underscore_case turns into camelCase

Not a huge change, but every function call and variable name that is underscore_case in Python will simply be camelCase in JS. In cases where there is an underscore but no word after it (a trailing underscore), the underscore is removed.

Python

from sklearn.linear_model import LinearRegression

X, y = [[1],[2]], [10, 20]
lr = LinearRegression(fit_intercept = False)
lr.fit(X, y)
print(lr.coef_)

Turns into

JavaScript

import * as tf from '@tensorflow/tfjs'
import { setBackend, LinearRegression } from 'scikitjs'
setBackend(tf)

let X = [[1], [2]]
let y = [10, 20]
let lr = new LinearRegression({ fitIntercept: false })
await lr.fit(X, y)
console.log(lr.coef)

In the code sample above, we see that fit_intercept turns into fitIntercept (and it's an object). And coef_ turns into coef.
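
If it helps to see the naming rule spelled out, it can be sketched as a small helper. `toCamelCase` is a hypothetical function written for this illustration and is not part of scikit.js.

```javascript
// Hypothetical helper (not part of scikit.js): maps a Python-style
// underscore_case name to the camelCase name described above.
function toCamelCase(name) {
  return name
    // Underscore followed by a letter or digit: drop the underscore
    // and uppercase the character that follows it.
    .replace(/_([a-z0-9])/gi, (match, ch) => ch.toUpperCase())
    // Any underscore left over (such as the trailing one in `coef_`) is removed.
    .replace(/_/g, '')
}

console.log(toCamelCase('fit_intercept')) // fitIntercept
console.log(toCamelCase('coef_')) // coef
```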

3. Always await calls to .fit or .fitPredict

It's common practice in JavaScript to not tie up the main thread. Many libraries, including TensorFlow.js, only provide an async "fit" function.

So if we build on top of them, our fit functions will be asynchronous. But what happens if we make our own estimator that has a synchronous fit function? Should we burden the user with finding out if their fit function is async or not, and then "awaiting" the proper one? I think not.

I think we should simply await all calls to fit. If you await the result of a synchronous function, the value is passed straight through and you are on your merry way. So I literally await all calls to .fit and you should too.

Python

from sklearn.linear_model import LogisticRegression

X, y = [[1],[-1]], [1, 0]
lr = LogisticRegression(fit_intercept = False)
lr.fit(X, y)
print(lr.coef_)

Turns into

JavaScript

import * as tf from '@tensorflow/tfjs'
import { setBackend, LogisticRegression } from 'scikitjs'
setBackend(tf)

let X = [[1], [-1]]
let y = [1, 0]
let lr = new LogisticRegression({ fitIntercept: false })
await lr.fit(X, y)
console.log(lr.coef)
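
To see why awaiting every fit call is harmless, here is a small sketch with one synchronous and one asynchronous estimator. `SyncEstimator`, `AsyncEstimator`, and `train` are hypothetical names made up for this example; neither class exists in scikit.js.

```javascript
// Hypothetical estimators for illustration; neither is part of scikit.js.
class SyncEstimator {
  fit(X, y) {
    this.nSamples = X.length
    return this // a plain value, not a Promise
  }
}

class AsyncEstimator {
  async fit(X, y) {
    // Stand-in for asynchronous training work.
    await new Promise((resolve) => setTimeout(resolve, 10))
    this.nSamples = X.length
    return this // wrapped in a Promise because the method is async
  }
}

// The same calling code works for both, because `await` accepts
// plain values as well as Promises.
async function train(estimator, X, y) {
  await estimator.fit(X, y)
  return estimator.nSamples
}

async function main() {
  const X = [[1], [-1]]
  const y = [1, 0]
  console.log(await train(new SyncEstimator(), X, y)) // 2
  console.log(await train(new AsyncEstimator(), X, y)) // 2
}

main()
```

The caller never has to check which kind of fit it is dealing with, which is the point of always awaiting.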

Contribution Guide

See guide

changelog

1.24.0 (2022-05-22)

Features

  • sgd classifier can now train on categorical variables, as well as one-hot encoded variables (10141cd)

1.23.0 (2022-05-19)

Features

  • added test case for custom callbacks. works great and somehow serializes. (7fa5c42)
  • custom modelfitargs for linear models (2ddcad9)

1.22.0 (2022-05-18)

Features

  • added back in logistic regression tests (dc2ec4a)
  • first pass at removing tensorflow from bundle (7562da2)
  • more tests moved over (76509d8)
  • removed hard dependency on tensorflow (0f2736e)
  • removed unneeded build steps (d3814ca)
  • updated serialize / deserialize to avoid tfjs error (1bf508d)

1.21.0 (2022-05-08)

Features

1.20.0 (2022-04-26)

Features

1.19.0 (2022-04-26)

Features

  • changed lodash imports to support building on esm.sh (3eabad9)

1.18.0 (2022-04-26)

Features

  • removed seedrandom in favor of inlining to help build on esm.sh (245d49c)

1.17.0 (2022-04-21)

Features

  • added automated tests to test our code in the browser (87e06a2)
  • renamed files, added to repo (9a63da4)

1.16.0 (2022-04-19)

Features

  • fixed loadBoston calls. Need to do the others (05c9d9a)
  • fixed tests (3f6654d)
  • remove data loading logic in favor of using dfd.readCSV(url) (3251738)

1.15.0 (2022-04-18)

Features

  • remove rollup from the build process, replace with esbuild (1f16ef8)
  • updated readme (7e70aba)

1.14.0 (2022-04-17)

Features

  • commented out tests (77b6ab6)
  • commenting out svc, svr code until it can be built in browser (dd95256)
  • disable libsvm until we can ship to the browser (fdc3214)
  • updated tests (6938b32)

1.13.0 (2022-04-17)

Features

1.12.0 (2022-04-17)

Features

  • only import from tensorflow and not subpackages (f971942)

1.11.0 (2022-04-17)

Features

1.10.0 (2022-04-17)

Features

  • removed danfo as a dependency (b8b5578)

1.9.0 (2022-02-27)

Bug Fixes

Features

  • add custom serializer to sgdclassifier (wip) (9c3f3dc)
  • add loss and optimizer type to enable easy parsing (08809ec)
  • add loss types and initializer types (33e1d2c)
  • add more estimators and makes serializer flexible for ensembles and pipeline (e2d319b)
  • add optimizer, loss and initializer caller (8778f73)
  • add serializer to criterion (8cbb737)
  • add serializer to criterion (92f765e)
  • add serializer to decision tree and update test (83ef949)
  • add serializer to kNeighborBase (5314044)
  • add serializer to labelencoder (6a1c362)
  • add serializer to linear model base class (9701a3e)
  • add serializer to NaiveBayes (84e747e)
  • add serializer to pipeline (6c425e1)
  • add serializer to splitter (763d7e1)
  • add serializer to SVC AND SVR (672328f)
  • add serializer to votingclassifier (2a21f31)
  • add serializer to votingRegressor (6d59faf)
  • allow ClassifierMixin to extends Serialize class (48d15f1)
  • allow Kmeans to inherit from serializer (6b36f4c)
  • implement generic class to Serialize models and transformers e.t.c (7f617bc)
  • implement serialize ensembles for ensemble class (bbd9fac)
  • make TransformerMixin and RegressorMixin extends serialize (5a001fd)
  • update linear model with new args to enable easy serialization (ec559e1)
  • update serialize to easily parse tensors (232cb62)
  • update Serialize to handle serialization of tensors (924b050)
  • update serialize to return inherited class (0733023)
  • update serializer for sgdclassifier (4971b3c)

1.8.0 (2022-01-28)

Bug Fixes

  • k-neighbors-regressor now supports no params (9656d6b)
  • kd-tree index issue fix + docs (5eba76c)
  • kd-tree protection copy + tfjs-core import (5c4348d)

Features

  • k-neighbors now lists available algorithms (fcfcb87)

1.7.0 (2022-01-23)

Bug Fixes

  • cross-val-score and k-fold fixes+improvements (21a566b)
  • cross-val-score api improvement etc (efe63f9)
  • k-fold memory leak (2f5529d)

Features

  • cross-val-score and k-fold implemented (6bc3ee3)
  • rand-utils create-rng (553232f)

1.6.0 (2022-01-21)

Features

1.5.0 (2022-01-20)

Features

  • add makeRegression function (5337ecf)

1.4.0 (2022-01-18)

Features

  • added ability for decision tree to handle negative input (a6cf53f)
  • first pass at decision tree classifier (550551e)
  • first pass at regression tree (849469a)

1.3.0 (2022-01-14)

Features

  • gaussian naive bayes classifier (8174ae1)
  • gaussian naive bayes classifier (d520b1a)

1.2.0 (2022-01-02)

Features

  • seeing if this package.json exports does the trick (4a73f7c)

1.1.0 (2021-12-31)

Bug Fixes

  • added fast-check dev dependency (fe9e693)
  • change max length on commit message (fe4ce57)
  • change max length on commit message (f4a8672)
  • commented out tests failing in test:browser (0fe0fe1)
  • k-neighbors-classifier await super.fit() (01632f4)

Features

  • k-neighbors kd-tree algorithm (59d40de)
  • kd-tree first draft (354979a)

Performance Improvements

  • k-neighbors kd-tree performance improvements (158506c)

1.0.3 (2021-12-31)

Bug Fixes

  • fixing any type to correct usage (0084771)

1.0.2 (2021-12-31)

Bug Fixes

  • fixing any type to correct usage (3f5c288)

1.0.1 (2021-12-31)

Bug Fixes

  • fixing any type to correct usage (4496805)

1.0.0 (2021-12-31)

Bug Fixes

  • broken UMD browser script (10f2e34)
  • fix lint (28c876d)
  • k-neighbors inverse distance weighting (a162baa)
  • k-neighbors predict now checks n_features (35efd93)

Features

  • add SVC (a5fe596)
  • added cut 1 of voting classifier (4045b81)
  • added tests and basic implementation of votingregressor (d7011e7)
  • added voting classifier (d7ab9c6)
  • broke out sgdlinear into sgdregressor and sgdclassifier (81fbee8)
  • changed imports (390375c)
  • finish (825ebb7)
  • First pass at VotingRegressor (ffb3393)
  • implemented kNeighborsRegression (a1a7174)
  • import libsvm (f0f0cc8)
  • k-neighbors regressor (94f6a69)
  • k-neighbors regressor (225e167)
  • k-neighbors-classifier implemented (d120257)
  • k-neighbors-regressor (050cec6), closes #111
  • k-neighbors-regressor (cb0a8b0)
  • linear svr (7a0534d)
  • simple first pass addition of linear-svc (483117d)
  • train test split implementation (97b89a5)
  • updated index to export linear-svc and updated docs (b5c116b)
