Machine Learning as a Service with firefly

  • Thomas Ebermann

I have recently discovered firefly, a very handy tool for data scientists who want to publish their machine learning models on the web as a service quickly, without spending hours with devops and devs just to deploy a new iteration. Let me show you.

I know there is Yhat ScienceOps, which is a product aimed exactly at this problem, but that solution is a bit pricey and maybe not the right thing if you want to prototype something really quickly. There is, of course, the option to use your own servers and wrap your ML model in a thin layer of Flask, as I have shown in a recommender example for Slack before. But now there is an even easier solution using firefly and Heroku that lets you deploy your prototypes basically for free.


You can easily install firefly with pip:

pip install firefly-python

Once it's installed (I've been using Python 2.7 - shame on me) you should be able to test it with:

firefly -h

Hello World Example

So we could write a simple function in a file called that returns the sum of two numbers:

def add(x, y):
    return x + y
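To make the JSON-in/JSON-out behavior concrete: firefly (roughly speaking) maps the POSTed JSON object onto the function's keyword arguments and serializes the return value back to JSON. A minimal conceptual sketch of that dispatch - not firefly's actual code:

```python
import json

def add(x, y):
    return x + y

def handle(raw_body):
    # Parse the JSON request body into keyword arguments,
    # call the function, and serialize the result back to JSON.
    kwargs = json.loads(raw_body)
    return json.dumps(add(**kwargs))

print(handle('{"x": 4, "y": 5}'))  # prints 9
```

This is why the curl payloads below are JSON objects whose keys match the function's parameter names.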

and then run it locally with firefly:

firefly example.add
2018-02-28 15:25:36 firefly [INFO] Starting Firefly...

The cool thing is that the function is now available at (firefly's default address) and you can use it with curl. Make sure the firefly server is still running in another tab.

curl -d '{"x": 4, "y": 5}'

or even with the built-in client:

import firefly
client = firefly.Client("")
client.add(x=4, y=5)    # returns 9


For any real-world example, you will need to use authentication. This is actually also quite easy with firefly. You simply supply an API token when starting it up:

firefly example.add --token plotti1234

Using the firefly client you can easily authenticate with:

client = firefly.Client("", auth_token="plotti1234")

If you don't supply it, you will get a:

firefly.client.FireflyError: Authorization token mismatch.

Of course, you can still use curl to do the same:

curl -d '{"x": 6, "y": 5}' -H "Authorization: Token plotti1234"
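Conceptually, the token check is just a comparison against the Authorization header the client sends. A simplified sketch of that check - not firefly's actual implementation:

```python
def authorized(headers, token="plotti1234"):
    # firefly expects a header of the form "Authorization: Token <token>"
    return headers.get("Authorization") == "Token " + token

print(authorized({"Authorization": "Token plotti1234"}))  # True
print(authorized({"Authorization": "Token wrong"}))       # False
```

Keep in mind this is a single shared secret, not per-user authentication - fine for a prototype, but not more.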

Going to production

Config File

You can also use a config.yml file to supply all of these parameters:

# config.yml
version: 1.0
token: "plotti1234"
    path: "/add"
    function: "example.add"

and then start firefly with:

firefly -c config.yml

Training a model and dumping it onto drive

Now you can train a model with scikit-learn and dump it to drive using joblib. You can then easily load it with firefly and serve it under a route. First, let's train a hello-world decision tree model on the iris dataset and dump it to drive:

from sklearn import tree
from sklearn import datasets
from sklearn.externals import joblib  # on newer scikit-learn versions: import joblib

# Load dataset
iris = datasets.load_iris()
X, Y =,
# Pick a model
clf = tree.DecisionTreeClassifier()
clf =, Y)
# Try it out
clf.predict([[5.1, 3.5, 1.4, 0.2]])
# array([0]) -- result of classification
# Dump it to drive
joblib.dump(clf, 'iris.pkl')
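joblib.dump is essentially a pickle optimized for objects carrying large numpy arrays, so the dump/load round trip behaves like regular pickling. Here is a sketch of that round trip using the standard library's pickle and a stand-in model, so it runs without scikit-learn installed:

```python
import os
import pickle
import tempfile

class StubModel:
    """Stand-in for the fitted DecisionTreeClassifier."""
    def predict(self, rows):
        # Always predicts class 0, like the iris example above.
        return [0 for _ in rows]

path = os.path.join(tempfile.mkdtemp(), "iris.pkl")
with open(path, "wb") as f:
    pickle.dump(StubModel(), f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored.predict([[5.1, 3.5, 1.4, 0.2]])[0])  # prints 0
```

The practical consequence: whatever process loads iris.pkl (your firefly dyno on Heroku) needs the same scikit-learn version available that was used to dump it.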

You can then load this model in a file called, expose it as a function, and you are done:

from sklearn.externals import joblib  # on newer scikit-learn versions: import joblib

clf = joblib.load('iris.pkl')

def predict(a):
    predicted = clf.predict(a)    # predicted is a 1-element numpy array
    return int(predicted[0])

To start it up you use the conventional method:

firefly iris.predict

And now you can access your trained model simply via the client or curl:

import firefly
client = firefly.Client("")
client.predict(a=[[5.1, 3.5, 1.4, 0.2]]) # the same values as above
0 # the same result, yay!

Deploy it to Heroku!

To deploy it to Heroku you need to add two files: a Procfile that says how to run our app, and a requirements.txt file that lists the libraries it will be using. The requirements.txt is quite straightforward; at a minimum it needs something like the following (pin the exact versions you actually use):

# requirements.txt
And for the Procfile you can use gunicorn to run it and supply the functions that you want to expose as environment parameters:

# Procfile
web: gunicorn --preload firefly.main:app -e FIREFLY_FUNCTIONS="iris.predict" -e FIREFLY_TOKEN="plotti1234"

The only thing left to do is commit it to git and deploy it to Heroku:

git init
git add . 
git commit -m "init"
heroku login # to login into your heroku account. 
heroku create # to create the app

The final step is the deployment, which on Heroku is done via git push:

git push heroku master
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 279 bytes | 279.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Compressing source files... done.
remote: Building source:
remote: -----> Python app detected
remote: -----> Installing requirements with pip
remote: -----> Discovering process types
remote:        Procfile declares types -> web
remote: -----> Compressing...
remote:        Done: 119.6M
remote: -----> Launching...
remote:        Released v7
remote: deployed to Heroku
remote: Verifying deploy... done.
   985a4c3..40726ee  master -> master

Test it

Now you've got a running machine learning model on Heroku for free! You can try it out via curl. Notice I've wrapped the array in a string representation to make things easy.

curl -d '{"a":"[[5.1, 3.5, 1.4, 0.2]]"}' -H "Authorization: Token plotti1234" https://<your-app>

You can of course also use the firefly client:

client = firefly.Client("https://<your-app>", auth_token="plotti1234")
client.predict(a=[[5.1, 3.5, 1.4, 0.2]])

Bonus: Multithreading and Documentation

Since we are using gunicorn, you can easily start four workers, and your API should hold up better under high load. Change your Procfile to:

web: gunicorn --workers 4 firefly.main:app -e FIREFLY_FUNCTIONS="iris.predict" -e FIREFLY_TOKEN="plotti1234"

Finally, there is only crude support for apidoc-style documentation so far. But when you do a GET request to the root / of your app, you get a listing of the docstrings from your code. Hopefully in the future they will also support apidoc or Swagger to make using such an API even more convenient:

curl -H "Authorization: Token plotti1234" https://<your-app>
{"app": "firefly", "version": "0.1.11", "functions": {"predict": {"path": "/predict", "doc": "\n    @api {post} /predict\n    @apiGroup Predict\n    @apiName PredictClass\n\n    @apiDescription This function predicts the class of iris.\n    @apiSampleRequest /predict\n    ", "parameters": [{"name": "a", "kind": "POSITIONAL_OR_KEYWORD"}]}}}

I highly recommend this still-young project, because for prototypes it really reduces deploying a new model to a git push heroku master. There are obviously some things missing, like extensive logging, performance benchmarking, various methods of authentication, and better support for docs. Yet it's so much fun to deploy models in such a convenient way.

Tell us what you think