Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(enums): fix types #1224

Merged
merged 29 commits into from
Oct 14, 2024
Merged

docs(enums): fix types #1224

merged 29 commits into from
Oct 14, 2024

Conversation

bonjourmauko
Copy link
Member

@bonjourmauko bonjourmauko commented Sep 19, 2024

Fixes #1233
Fixes #1271 (closes #1272)
Fixes #1267 (closes #1273)
Depends on #1223
Depended upon by #1146

Technical changes

  • Fix type definitions in the enums module

@bonjourmauko bonjourmauko added level:intermediate Requires average OpenFisca experience kind:refactor Refactoring and code cleanup labels Sep 19, 2024
@bonjourmauko bonjourmauko requested review from a team September 19, 2024 10:37
@bonjourmauko bonjourmauko self-assigned this Sep 19, 2024
@bonjourmauko bonjourmauko changed the base branch from master to fix-mypy-checks-periods September 19, 2024 10:37
@coveralls
Copy link

coveralls commented Sep 19, 2024

Coverage Status

coverage: 82.257% (+0.5%) from 81.804%
when pulling 9b7d9c9 on fix-mypy-checks-enums
into df9e9d7 on master.

@bonjourmauko bonjourmauko changed the title [5/17] Fix doc & type definitions in the enums module docs: fix doc & types in enums Sep 27, 2024
Base automatically changed from fix-mypy-checks-periods to master September 30, 2024 19:52
@bonjourmauko bonjourmauko added kind:docs Add or improvement of documentation and removed kind:refactor Refactoring and code cleanup labels Sep 30, 2024
@bonjourmauko bonjourmauko changed the title docs: fix doc & types in enums docs(enums): fix types Oct 1, 2024
@bonjourmauko
Copy link
Member Author

Thanks @benjello , this fixes quite a few bugs :) Would you mind taking a look at the depending pull requests? I can't merge this one alone.

@bonjourmauko
Copy link
Member Author

For performance, I've run the following test:

#!/usr/bin/env python

from pyinstrument import Profiler

from openfisca_core import indexed_enums as enum

import numpy


class MyEnum(enum.Enum):
    A = 1
    B = 2
    C = 3
    D = 4
    E = 5
    F = 6
    G = 7
    H = 8
    I = 9
    J = 10
    K = 11
    L = 12
    M = 13
    N = 14
    O = 15
    P = 16
    Q = 17
    R = 18
    S = 19
    T = 20
    U = 21
    V = 22
    W = 23
    X = 24
    Y = 25
    Z = 26


array = [[1], ["A"], [3], [4], [MyEnum.A], [6], ["B", "E"], [1, 2, 3], list(MyEnum)]
ndarray = [numpy.array(arr) for arr in array]
generator = (
    [
        MyEnum.encode(arr) == arr,
        MyEnum.encode(arr).decode(),
        MyEnum.encode(arr).decode_to_str(),
    ]
    for arr in array
    for _ in range(1000)
)

if __name__ == "__main__":
    profiler = Profiler(use_timing_thread=True, interval=0.0001)
    profiler.start()
    list(generator)
    profiler.stop()
    print(profiler.output_text(unicode=True, color=True))
Before
git:(master) ✗ ./enum-perf.py

  _     ._   __/__   _ _  _  _ _/_   Recorded: 07:09:30  Samples:  19562
 /_//_/// /_\ / //_// / //_'/ //     Duration: 2.542     CPU time: 2.908
/   _/                      v5.0.0

Profile at /Volumes/NO NAME/Sites/openfisca/openfisca-core/./enum-perf.py:53

2.542 <module>  enum-perf.py:1
└─ 2.538 <genexpr>  enum-perf.py:41
   ├─ 1.259 EnumArray.decode  openfisca_core/indexed_enums/enum_array.py:67
   │  ├─ 0.702 select  numpy/lib/function_base.py:768
   │  │     [7 frames hidden]  <built-in>, numpy
   │  ├─ 0.331 EnumArray.__eq__  openfisca_core/indexed_enums/enum_array.py:36
   │  │  ├─ 0.274 [self]  openfisca_core/indexed_enums/enum_array.py
   │  │  └─ 0.032 EnumArray.view  <built-in>
   │  ├─ 0.175 [self]  openfisca_core/indexed_enums/enum_array.py
   │  └─ 0.028 <genexpr>  enum.py:820
   ├─ 1.071 EnumArray.decode_to_str  openfisca_core/indexed_enums/enum_array.py:85
   │  ├─ 0.522 select  numpy/lib/function_base.py:768
   │  │     [7 frames hidden]  numpy, <built-in>
   │  ├─ 0.334 EnumArray.__eq__  openfisca_core/indexed_enums/enum_array.py:36
   │  │  ├─ 0.280 [self]  openfisca_core/indexed_enums/enum_array.py
   │  │  └─ 0.029 EnumArray.view  <built-in>
   │  ├─ 0.125 [self]  openfisca_core/indexed_enums/enum_array.py
   │  ├─ 0.044 property.__get__  enum.py:202
   │  └─ 0.027 <genexpr>  enum.py:820
   ├─ 0.146 MyEnum.encode  openfisca_core/indexed_enums/enum.py:29
   │  └─ 0.118 EnumArray.__new__  openfisca_core/indexed_enums/enum_array.py:20
   │     └─ 0.099 asarray  <built-in>
   └─ 0.049 EnumArray.__eq__  openfisca_core/indexed_enums/enum_array.py:36
After
git:(fix-mypy-checks-enums) ✗ ./enum-perf.py

  _     ._   __/__   _ _  _  _ _/_   Recorded: 07:08:36  Samples:  1783
 /_//_/// /_\ / //_// / //_'/ //     Duration: 0.231     CPU time: 0.438
/   _/                      v5.0.0

Profile at /Volumes/NO NAME/Sites/openfisca/openfisca-core/./enum-perf.py:53

0.231 <module>  enum-perf.py:1
└─ 0.229 <genexpr>  enum-perf.py:41
   ├─ 0.139 MyEnum.encode  openfisca_core/indexed_enums/enum.py:142
   │  ├─ 0.108 MyEnum._encode_array_like  openfisca_core/indexed_enums/enum.py:222
   │  │  ├─ 0.020 _is_int_array_like  openfisca_core/indexed_enums/_guards.py:114
   │  │  │  ├─ 0.011 [self]  openfisca_core/indexed_enums/_guards.py
   │  │  │  └─ 0.007 <genexpr>  openfisca_core/indexed_enums/_guards.py:139
   │  │  ├─ 0.018 _is_enum_array_like  openfisca_core/indexed_enums/_guards.py:56
   │  │  │  ├─ 0.011 <genexpr>  openfisca_core/indexed_enums/_guards.py:83
   │  │  │  │  ├─ 0.006 isinstance  <built-in>
   │  │  │  │  └─ 0.005 [self]  openfisca_core/indexed_enums/_guards.py
   │  │  │  └─ 0.007 [self]  openfisca_core/indexed_enums/_guards.py
   │  │  ├─ 0.018 [self]  openfisca_core/indexed_enums/enum.py
   │  │  ├─ 0.018 EnumArray.__new__  openfisca_core/indexed_enums/enum_array.py:75
   │  │  │  ├─ 0.010 [self]  openfisca_core/indexed_enums/enum_array.py
   │  │  │  ├─ 0.005 EnumArray.__array_finalize__  openfisca_core/indexed_enums/enum_array.py:85
   │  │  │  └─ 0.002 ndarray.view  <built-in>
   │  │  ├─ 0.016 _int_to_index  openfisca_core/indexed_enums/_utils.py:63
   │  │  │  ├─ 0.006 [self]  openfisca_core/indexed_enums/_utils.py
   │  │  │  ├─ 0.004 MyEnum.__members__  enum.py:828
   │  │  │  └─ 0.004 array  <built-in>
   │  │  ├─ 0.008 _is_str_array_like  openfisca_core/indexed_enums/_guards.py:172
   │  │  ├─ 0.005 _enum_to_index  openfisca_core/indexed_enums/_utils.py:8
   │  │  │  └─ 0.003 array  <built-in>
   │  │  └─ 0.004 _str_to_index  openfisca_core/indexed_enums/_utils.py:126
   │  ├─ 0.014 [self]  openfisca_core/indexed_enums/enum.py
   │  ├─ 0.010 Sequence.__instancecheck__  <frozen abc>:117
   │  │     [2 frames hidden]  <built-in>, <frozen abc>
   │  └─ 0.005 isinstance  <built-in>
   ├─ 0.064 EnumArray.__eq__  openfisca_core/indexed_enums/enum_array.py:91
   │  ├─ 0.054 [self]  openfisca_core/indexed_enums/enum_array.py
   │  ├─ 0.005 MyEnum.__eq__  openfisca_core/indexed_enums/enum.py:126
   │  │  ├─ 0.002 isinstance  <built-in>
   │  │  └─ 0.002 [self]  openfisca_core/indexed_enums/enum.py
   │  └─ 0.003 isinstance  <built-in>
   ├─ 0.012 [self]  enum-perf.py
   ├─ 0.007 EnumArray.decode  openfisca_core/indexed_enums/enum_array.py:248
   └─ 0.006 EnumArray.decode_to_str  openfisca_core/indexed_enums/enum_array.py:283

@benoit-cty
Copy link
Contributor

benoit-cty commented Oct 14, 2024

Congratulation, this is a huge improvement !
How typing could speed things up as I understand that Python do not use type anotations at runtime ?

EDIT : OK seing your comment it's because now it's relying more on Numpy instead of pure Python. Make sens.

@bonjourmauko
Copy link
Member Author

Congratulation, this is a huge improvement !
How typing could speed things up as I understand that Python do not use type anotations at runtime ?

EDIT : OK seing your comment it's because now it's relying more on Numpy instead of pure Python. Make sens.

Yes, it is not typing itself, but thanks to the information you get from having things typed.

@benjello
Copy link
Member

@bonjourmauko @benoit-cty : I strongly recommend running this branch on openfisca-france-data or the like.
Performance should be tested with large datasets even if I understand we are just dealing with the enum parts here.

And I would not use a bug bump since changing the types of enum can affect teh interaction with data preparation that may interact with enum. In the light of part migration problems, I would adpot a very conservative approach concerning versionning.

@bonjourmauko
Copy link
Member Author

@benjello Do you mean announcing it as major or minor? @benjello or @benoit-cty could you test on openfisca-france-data? I have no means of doing that by myself.

@bonjourmauko bonjourmauko changed the base branch from test/fix-enums-doctests to master October 14, 2024 10:14
@bonjourmauko
Copy link
Member Author

I marked it as a major, so we can move on with the merge. That gives us the time for any user experiencing difficulties to open issues that we can publish subsequently as patches/fixes.

@bonjourmauko bonjourmauko force-pushed the fix-mypy-checks-enums branch 6 times, most recently from 93f9bea to 0ca8e75 Compare October 14, 2024 11:13
BREAKING_CHANGE: This changeset has not breaking changes to the `indexed_enums` public API.
However, as a conservative measure concerning data preparation for large
population simulations, it has been marked as a major release.

Fixes #1271
Fixes #1267
Fixes #1233
@bonjourmauko bonjourmauko merged commit 66d0052 into master Oct 14, 2024
29 checks passed
@bonjourmauko bonjourmauko deleted the fix-mypy-checks-enums branch October 14, 2024 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:docs Add or improvement of documentation level:intermediate Requires average OpenFisca experience
Projects
Development

Successfully merging this pull request may close these issues.

Add docs to enums Fix enums doctests Add typing to enums
4 participants