Releases: spotify/luigi

Version 2.6.0

10 Feb 06:59

Luigi 2.6.0 comes with many new cool features!

  • Removed deprecations! luigi.{hadoop, hadoop_jar, hdfs, hive, scalding, webhdfs} are removed, use luigi.contrib.{..} instead #1995
  • Deprecations! luigi.{postgres, s3} are now moved into luigi.contrib. #1997
  • Multiple workers finally work on Windows again! #1992
  • The server can now communicate with the clients! We started small and implemented it so that --workers can be set. #1993
  • The visualizer now puts your search queries in the URL hash, so links are finally shareable! #1986 #2002
  • A new recommended way to automatically set the task namespace! #2000 (docs)
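
For the two import moves listed above, a small helper can do most of the mechanical rewriting. This is only a minimal sketch: the function name `rewrite_imports` and the regex approach are my own illustration (nothing like it ships with luigi), and it only handles dotted references such as `luigi.hdfs`, not `from luigi import hdfs`.

```python
import re

# Modules that left the top-level luigi package, per the notes above.
MOVED = ("hadoop_jar", "hadoop", "hdfs", "hive", "scalding", "webhdfs", "postgres", "s3")

# Match "luigi.<module>" as a whole dotted name. "luigi.contrib.<module>" is
# left alone because "contrib" is not in the MOVED list.
_PATTERN = re.compile(r"\bluigi\.(" + "|".join(MOVED) + r")\b")


def rewrite_imports(source):
    """Return source with luigi.<module> rewritten to luigi.contrib.<module>."""
    return _PATTERN.sub(r"luigi.contrib.\1", source)


print(rewrite_imports("from luigi.s3 import S3Target"))
```

Review the result by hand before committing; a regex cannot tell a module reference from an unrelated string that happens to look like one.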

There have been a lot of other bugfixes, docsfixes, cleanups and added testcases! See all commits here.

Version 2.5.0

10 Jan 07:57

This release contains mostly bugfixes, but also changes to the otherwise quite stale luigi core.

Most users will probably not have anything break. But at least read the
warnings placed below to see what could have changed for you.

luigi:
  • Changed behavior warning! BooleanParameter is now removed after a long deprecation. Instead simply use BoolParameter. #1959
  • Make luigi Task classes more pythonic and functional:
    • Changed behavior warning! task_namespace is now inherited as usual in python and not overridden by metamagic from luigi. #1950 (Thanks @riga).
    • Changed behavior warning! externalize now goes out of its way to ensure it doesn't mutate its input anymore, and returns a copy, allowing for new cool usage patterns. #1975 (docs) (shameless thanks @Tarrasch :p)
    • Concepts like task namespace and friends are now documented. Curious folks can read the new docs. :)
  • Further bigquery improvements from @spotify engineers: #1896 #1946 (Thanks @ukarlsson and @fabriziodemaria and more)
  • Various bugfixes:
    • Fix serialization of TimeDeltaParameter #1968 (Thanks @kierkegaard13)
    • Fix execution summary and return codes for successfully retried tasks #1951 (Thanks @bwtakacy)
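
To make the externalize change above concrete, here is a minimal sketch of the "return a copy, don't mutate the input" behavior. It deliberately does not import luigi: `FakeTask` and `externalize_sketch` are hypothetical stand-ins, and I'm assuming that "externalized" means the copy has its run step disabled.

```python
import copy


class FakeTask:
    """Stand-in for a task object; NOT a luigi class."""

    def __init__(self, name):
        self.name = name

    def run(self):
        return "ran " + self.name


def externalize_sketch(task):
    """Return a shallow copy whose run is disabled; the input stays untouched."""
    external = copy.copy(task)
    # Setting the attribute on the copy shadows the class method for that
    # one instance only.
    external.run = None
    return external


original = FakeTask("daily_report")
external = externalize_sketch(original)
print(external.run, original.run())
```

Because the original is left intact, the same task can be used both as a runnable task and as an external dependency in one process, which is presumably the kind of usage pattern the note has in mind.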

There have been a lot of other bugfixes, docsfixes, cleanups and added testcases! See all commits here.

Version 2.4.0

02 Dec 05:05

This release comes with a few new features and some changed behaviors, hopefully bringing us a tiny step towards scheduling heaven.

luigi:
luigi.scheduler:
  • Reverted the behavior introduced in 2.3.0. #1926

There have been a lot of other features, cleanups and bugfixes! See all commits here.

Version 2.3.3

21 Oct 04:12

Biggest risk of breakage for people updating early:

  • File locking strategy just got changed (on Unix) #1886 Thanks @nmandery

Other things:

And many other small improvements. Thanks to everyone who has contributed!

Version 2.3.2

20 Sep 09:50

This is mostly a bug-fix release.


  • Changed behaviour (read bugfix) in local locking #1842
  • Changed behaviour (read bugfix) in disabling workers #1839
  • Many bugfixes to the scheduler, particularly related to the Batch running functionality.

Here are the changes commit by commit.

Version 2.3.1

25 Aug 02:42

This release mainly fixes bugs introduced in the two latest releases and also
some older bugs.


  • Bugfix regarding sftp #1825
  • Bugfix regarding error emails with smtp #1821
  • Bugfix regarding spark tasks #1819
  • Bugfixes regarding visualiser #1817 #1818

Here are the changes commit by commit.

Version 2.3.0

12 Aug 03:20

It has been over a month since the last release. This new release includes a
bunch of new features. What I like the most is that they all come with full and
proper documentation!

luigi:
luigi.scheduler:
  • We now have a new definition of the UPSTREAM statuses. The new intuition is that
    UPSTREAM_FAILED means that a task cannot run because all of its upstream
    tasks have failed or worse (like being disabled). As a result, far fewer
    tasks will be considered to have an upstream status. #1789

There have been a lot of other features, cleanups and bugfixes! See all commits here.

Version 2.2.0

08 Jul 10:55

It has been 3 months since the latest release, which made Google's results on readthedocs outdated and gave luigi a stale feeling. Enjoy updated and hopefully bugfree software. :)

At least read these

  • Luigi finally has user-land configurable task status messages #1625
  • Parameters: From now on, you must not pass None as the default for a
    parameter. Usually, passing the empty string '' is a sufficient
    replacement. If you pass None anyway, luigi will print a deprecation warning.
    #1624.
  • Logging for server: Things are greatly improved now as of #1633 and #1636.
    Here's my jotted-down usage info about it: #1752 (comment).

Main changes

luigi:
  • More fine grained eventhandlers #1698
  • Range: Finally a proper way to pass along parameters: #1675
  • From this release, we'll also bump the debian version number. #1718
  • Print your dependency tree as ascii art! #1680
  • We now have a template for PRs! #1655
luigi.contrib:
  • AWS: You can now set the session token: https://github.com/spotify/luigi/pull/1702/files
  • Salesforce: Add support for multiple results #1686
  • FTP: Configurable port #1689
  • MSSQL support: #1650
  • Streaming mapreduce: Allow additional archives #1649
  • Streaming mapreduce: Recognize the Google File System formats #1664
  • Streaming mapreduce: mrrunner.py is not hardcoded as the binary being run #1565

Various goodies

Contribution spirit

A few great examples that show how improvements are well-received no matter how small they are. These "small" changes help hundreds of people reading the docs of luigi. #1672 #1642

Other changes

There were even more changes which we didn't include in these release notes. We are happy to receive every contribution, whether merged or not. So please keep contributing! :)

Version 2.1.1 (Includes security fix)

06 Apr 07:35

The last release was only 2 business days ago (as opposed to the 5 months since the one before it). But this release got rushed because of a security fix!

In addition to doc fixes:

Additions

Security bugfix

  • The server now has an explicit whitelist of external commands.
    • Previous potential harm: Malicious hackers could run arbitrary code if they had file system (even external mounts!) and network access on the machine running luigid. The code would be executed as the user that you run luigid with.
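
As a generic illustration of what an explicit command whitelist looks like (this is not luigi's actual code, and the exploit details are intentionally left out), here is a minimal sketch. The set contents and the name `check_command` are hypothetical.

```python
import shlex

# Hypothetical allowlist; the real set of permitted commands is not stated
# in these notes.
ALLOWED_COMMANDS = frozenset({"sendmail", "mail"})


def check_command(command_line):
    """Split a command line and refuse it unless the program is allowlisted."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise ValueError("command not whitelisted: %r" % (command_line,))
    return argv


print(check_command("mail -s hello someone"))
```

The point of the design is that the default answer is "no": anything not named explicitly is rejected, rather than trying to enumerate what is dangerous.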

We will wait a while before explaining how to use this exploit, to give people time to apply the bugfix.

Version 2.1.0

01 Apr 20:59

Finally, a new PyPI release after a rather long while. Thanks @Tarrasch and @erikbern, and all the contributors!

Added

luigi:
  • Notifications: more emails and proper coloring (#1471), improved SMTP handling
  • EnumParameter (#1479), DictParameter (#1574)
  • Support for Python 3.5 (#1494)
  • Process locking on Alpine Linux (#1530) and Windows (#1557)
  • Visualizer: resources tab (#1566), GUI functionality to disable a worker (#1564)
luigi.contrib:
  • ExternalBigqueryTask (#1434), BigqueryCreateViewTask (#1465)
  • Luigi tasks for Dataproc, Google's managed Hadoop MapReduce, Spark, Pig, and Hive service (#1601)
  • ExternalProgramTask, ExternalPythonProgramTask - commonalities for running any external application or script (#1520)
  • Support for SFTP (#1585)
  • Sped-up Hive client using Metastore (#1533)
  • OpenerTarget, a single Luigi target to open multiple file system types (#1555)
  • Query base task (giving rise also to luigi.contrib.redshift.RedshiftQuery and luigi.postgres.PostgresQuery) (#1493)
  • RedshiftUnloadTask (#1527)
  • UploadToSalesforceTask (#1404)
  • Support for S3 assumed role (#1596)

Changed

luigi:
  • Semi-opaque, hashed task_id (as opposed to TaskName(param1=value1, param2=foo bar)) (#1444)
  • More explicit way to handle timelike parameters (date vs datetime) (#1473)
  • Optimizations in scheduler algorithm
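
To give a feel for what a "semi-opaque, hashed" task_id can look like, here is a minimal sketch. The exact format luigi uses may differ: the function name `task_id_sketch`, the truncation lengths, and the choice of MD5 over canonical JSON are assumptions made for illustration.

```python
import hashlib
import json
import re


def task_id_sketch(task_family, params):
    """Build an id of the form <family>_<readable summary>_<short hash>."""
    # Canonical JSON, so the same parameters always produce the same hash.
    param_str = json.dumps(params, separators=(",", ":"), sort_keys=True)
    param_hash = hashlib.md5(param_str.encode("utf-8")).hexdigest()[:10]
    # Readable part: the first three parameter values (sorted by name),
    # each truncated to 16 characters.
    summary = "_".join(str(params[key])[:16] for key in sorted(params)[:3])
    # Replace anything that is awkward in file names or URLs.
    summary = re.sub(r"[^A-Za-z0-9_]", "_", summary)
    return "{}_{}_{}".format(task_family, summary, param_hash)


print(task_id_sketch("TaskName", {"param1": "value1", "param2": "foo bar"}))
```

The id stays short and safe to embed in URLs even when parameter values are long or contain spaces, while the hash keeps ids unique across parameter sets that share the same readable prefix.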

Removed

luigi:
  • Old deprecated (2014) stuff around scheduler and its state (#1592)
luigi.contrib:
  • Deprecated classes SparkJob, Spark1xBackwardCompat, Spark1xJob, PySpark1xJob (#1442)

Fixed

luigi.contrib:
  • Ensure that FTP RemoteTarget successfully creates temporary files (meaning, in a directory relative to output) (#1515)
  • Remove superfluous init_mapper()/init_reducer() calls in LocalJobRunner (#1475)
  • Humanly format HadoopJobError (#1528)
  • Broken Redshift table creation (#1453)
  • Improved Salesforce reliability (#1597, #1600)
  • Missing call to post_copy() (#1502)

...and a slew of other additions, fixes, improvements and documentation.