Python) Numba Examples (TODO)

My earlier post on Numba

https://data-newbie.tistory.com/390

EX) Monte Carlo Method

import random
from numba import jit

@jit(nopython=True)
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

def monte_carlo_pi_no_numba(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples
    
%time monte_carlo_pi_no_numba(10000)
#CPU times: user 5.53 ms, sys: 808 µs, total: 6.34 ms
#Wall time: 6.07 ms

%time monte_carlo_pi(10000)  # first call: includes JIT compilation
#CPU times: user 329 ms, sys: 40 ms, total: 369 ms
#Wall time: 451 ms

%time monte_carlo_pi(10000)  # second call: reuses the compiled code
#CPU times: user 61 µs, sys: 45 µs, total: 106 µs
#Wall time: 109 µs

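The performance difference is clear. The first jitted call is slow (451 ms) because it includes compilation, but the second call takes 109 µs versus 6.07 ms for the pure Python version, roughly 50x faster.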
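To compare the two versions fairly outside a notebook, it helps to call the jitted function once as a warm-up and time only the later calls. The following is a minimal sketch of that idea; the helper name `timed` and the plain-Python fallback for when Numba is not installed are my own assumptions, not part of the original example.

```python
import random
import time

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python so the sketch still runs
    def njit(func):
        return func


@njit
def monte_carlo_pi(nsamples):
    acc = 0
    for _i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples


def timed(func, *args):
    """Hypothetical helper: return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# The first call pays the compilation cost when Numba is available.
_, first = timed(monte_carlo_pi, 10000)
# The second call reuses the compiled code, so this is the number to compare.
pi_estimate, second = timed(monte_carlo_pi, 10000)
print(f"first: {first:.6f} s, second: {second:.6f} s, pi ~ {pi_estimate:.3f}")
```

This is the same thing the two `%time` calls above do, just without IPython magics.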

A bad case

def original_function(input_list):
    output_list = []
    for item in input_list:
        if item % 2 == 0:
            output_list.append(2)
        else:
            output_list.append('1')
    return output_list

test_list = list(range(100000))
original_function(test_list)[0:10]

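The code below does not fail outright, but it emits warnings: plain jit cannot infer a single element type for the list, so it falls back to object mode. (Recent Numba releases no longer fall back and raise an error instead.)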

jitted_function = jit()(original_function)
jitted_function(test_list)[0:10]

<ipython-input-5-bff7290ad327>:1: NumbaWarning: 
Compilation is falling back to object mode WITH looplifting enabled because Function "original_function" failed type inference due to: Invalid use of BoundFunction(list.append for list(int64)<iv=None>) with parameters (Literal[str](1))

During: resolving callee type: BoundFunction(list.append for list(int64)<iv=None>)
During: typing of call at <ipython-input-5-bff7290ad327> (7)


File "<ipython-input-5-bff7290ad327>", line 7:
def original_function(input_list):
    <source elided>
        else:
            output_list.append('1')
            ^

  def original_function(input_list):
<ipython-input-5-bff7290ad327>:1: NumbaWarning: 
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "original_function" failed type inference due to: Cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "<ipython-input-5-bff7290ad327>", line 3:
def original_function(input_list):
    <source elided>
    output_list = []
    for item in input_list:
    ^

    output_list = []
    for item in input_list:
    ^

  warnings.warn(errors.NumbaDeprecationWarning(msg,
[2, '1', 2, '1', 2, '1', 2, '1', 2, '1']

The elements of the list need to share one type. The function still runs, but there is no speedup.
In this run it actually came out slower.

%time _ = original_function(test_list)
CPU times: user 18.9 ms, sys: 0 ns, total: 18.9 ms
Wall time: 18.7 ms

%time _ = jitted_function(test_list)
CPU times: user 29.9 ms, sys: 734 µs, total: 30.6 ms
Wall time: 30.4 ms

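Numba lets code like this run anyway by falling back to object mode, so you can technically use it, but you get no speedup! With njit (nopython mode, no fallback), the same function fails with a TypingError: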

from numba import njit

njitted_function = njit()(original_function)
njitted_function(test_list)[0:5]
---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-12-61341abb3f42> in <module>
      1 njitted_function = njit()(original_function)
----> 2 njitted_function(test_list)[0:5]

/opt/conda/lib/python3.8/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
    412                 e.patch_message(msg)
    413 
--> 414             error_rewrite(e, 'typing')
    415         except errors.UnsupportedError as e:
    416             # Something unsupported is present in the user code, add help info

/opt/conda/lib/python3.8/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
    355                 raise e
    356             else:
--> 357                 raise e.with_traceback(None)
    358 
    359         argtypes = []

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of BoundFunction(list.append for list(int64)<iv=None>) with parameters (Literal[str](1))

During: resolving callee type: BoundFunction(list.append for list(int64)<iv=None>)
During: typing of call at <ipython-input-5-bff7290ad327> (7)


File "<ipython-input-5-bff7290ad327>", line 7:
def original_function(input_list):
    <source elided>
        else:
            output_list.append('1')
            ^

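Passing a plain Python list to a Numba function can make it slower. In the timings below, pure Python takes 15.6 ms and the jitted call 291 ms, although that first jitted call also includes compilation.

Simply switching the list to an np.array gives a clear speedup.

Lists with mixed types only run through the object-mode fallback, which stays slow. Even (2, 1.5) counts as mixed, since it is (int, float), and nopython mode rejects it, as the Numba documentation says: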

Creating and returning lists from JIT-compiled functions is supported, as well as all methods and operations. Lists must be strictly homogeneous: Numba will reject any list containing objects of different types, even if the types are compatible (for example, [1, 2.5] is rejected as it contains a int and a float).
# https://numba.pydata.org/numba-doc/dev/reference/pysupported.html#list

def sane_function(input_list):
    output_list = []
    for item in input_list:
        if item % 2 == 0:
            output_list.append(2)
        else:
            output_list.append(1)
    return output_list

test_list = list(range(100000))
%time sane_function(test_list)[0:5]
CPU times: user 15.8 ms, sys: 0 ns, total: 15.8 ms
Wall time: 15.6 ms
[2, 1, 2, 1, 2]

njitted_sane_function = njit()(sane_function)
%time njitted_sane_function(test_list)[0:5]
/opt/conda/lib/python3.8/site-packages/numba/core/ir_utils.py:2067: NumbaPendingDeprecationWarning: 
Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'input_list' of function 'sane_function'.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "<ipython-input-13-9a7e18fa2d25>", line 1:
def sane_function(input_list):
^

  warnings.warn(NumbaPendingDeprecationWarning(msg, loc=loc))
CPU times: user 290 ms, sys: 3.88 ms, total: 293 ms
Wall time: 291 ms
[2, 1, 2, 1, 2]

import numpy as np
test_list = np.arange(100000)
%time njitted_sane_function(test_list)[0:5]
#CPU times: user 2.33 ms, sys: 0 ns, total: 2.33 ms
#Wall time: 7.9 ms
#[2, 1, 2, 1, 2]
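The "reflected list" deprecation warning above appears because a plain Python list was passed into a jitted function. Besides switching to a NumPy array, another option is `numba.typed.List`. The sketch below assumes a Numba version that ships `numba.typed.List`, and falls back to plain Python (my own addition) when Numba is not installed.

```python
try:
    from numba import njit
    from numba.typed import List
except ImportError:  # assumption: plain-Python fallback so the sketch still runs
    def njit(func):
        return func
    List = list


@njit
def sane_function(input_list):
    output_list = []
    for item in input_list:
        if item % 2 == 0:
            output_list.append(2)
        else:
            output_list.append(1)
    return output_list


# Build a homogeneous typed list; every element is an int.
typed_input = List()
for value in range(1000):
    typed_input.append(value)

result = sane_function(typed_input)
print(list(result[0:5]))
```

Filling a typed list element by element from Python is itself slow, so for purely numeric data the np.array route shown above is usually the simpler choice.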

vectorize

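As the results below show, the same function takes a different amount of time on the first and second calls.

The first call is slower because the function is actually being compiled.

The second call shows the dramatic speedup you ultimately get.

With vectorize, an output array of the right size is allocated up front, which is an optimization over the earlier version of the function, where the list grew to an unknown size. You can get the same effect in the original function by allocating the output array first.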
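As a rough sketch of the "allocate the output array first" idea applied to the original loop-style function, something like the following should work. The name `preallocated_function` is hypothetical, and the plain-Python fallback is only there so the snippet runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: plain-Python fallback so the sketch still runs
    def njit(func):
        return func


@njit
def preallocated_function(input_array):
    # Allocate the output once, with the same shape and dtype as the input,
    # instead of growing a list inside the loop.
    output = np.empty_like(input_array)
    for i in range(input_array.shape[0]):
        if input_array[i] % 2 == 0:
            output[i] = 2
        else:
            output[i] = 1
    return output


test_array = np.arange(100000)
result = preallocated_function(test_array)
print(result[:5])
```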

from numba import vectorize

@vectorize(nopython=True)
def non_list_function(item):
    if item % 2 == 0:
        return 2
    else:
        return 1
%time non_list_function(test_list)

CPU times: user 68.9 ms, sys: 3.9 ms, total: 72.8 ms
Wall time: 72.2 ms
array([2, 1, 2, ..., 1, 2, 1])

%time non_list_function(test_list)
CPU times: user 0 ns, sys: 539 µs, total: 539 µs
Wall time: 309 µs
array([2, 1, 2, ..., 1, 2, 1])

Example) Spring mass

from IPython.display import Image

Image('https://upload.wikimedia.org/wikipedia/commons/f/fa/Spring-mass_under-damped.gif')

# Let's mix wet friction with dry friction. This makes the behavior
# of the system dependent on the initial condition, which
# may be interesting to study by running an exhaustive simulation.

def friction_fn(v, vt):
    if v > vt:
        return - v * 3
    else:
        return - vt * 3 * np.sign(v)

def simulate_spring_mass_funky_damper(x0, T=10, dt=0.0001, vt=1.0):
    times = np.arange(0, T, dt)
    positions = np.zeros_like(times)
    
    v = 0
    a = 0
    x = x0
    positions[0] = x0/x0
    
    for ii in range(len(times)):
        if ii == 0:
            continue
        t = times[ii]
        a = friction_fn(v, vt) - 100*x
        v = v + a*dt
        x = x + v*dt
        positions[ii] = x/x0
    return times, positions
    
import matplotlib.pyplot as plt
plt.plot(*simulate_spring_mass_funky_damper(0.1))
plt.plot(*simulate_spring_mass_funky_damper(1))
plt.plot(*simulate_spring_mass_funky_damper(10))
plt.legend(['0.1', '1', '10'])

%time _ = simulate_spring_mass_funky_damper(0.1)
CPU times: user 267 ms, sys: 2.78 ms, total: 269 ms
Wall time: 268 ms

@njit
def friction_fn(v, vt):
    if v > vt:
        return - v * 3
    else:
        return - vt * 3 * np.sign(v)

@njit
def simulate_spring_mass_funky_damper(x0, T=10, dt=0.0001, vt=1.0):
    times = np.arange(0, T, dt)
    positions = np.zeros_like(times)
    
    v = 0
    a = 0
    x = x0
    positions[0] = x0/x0
    
    for ii in range(len(times)):
        if ii == 0:
            continue
        t = times[ii]
        a = friction_fn(v, vt) - 100*x
        v = v + a*dt
        x = x + v*dt
        positions[ii] = x/x0
    return times, positions

_ = simulate_spring_mass_funky_damper(0.1)


%time _ = simulate_spring_mass_funky_damper(0.1)
CPU times: user 931 µs, sys: 209 µs, total: 1.14 ms
Wall time: 1.16 ms

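Just adding @njit to the existing code makes it more than 200x faster (268 ms vs 1.16 ms).

However, when many of these simulations are run from a thread pool, it reportedly does not look any faster, and htop does not show all cores being used.

This is because Numba functions do not release the global interpreter lock (GIL) by default.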

GIL(NOGIL)

%%time
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(8) as ex:
    ex.map(simulate_spring_mass_funky_damper, np.arange(0.1, 1000, 0.1))  # start at 0.1: x0 = 0 would divide by zero in x/x0
    
CPU times: user 10.1 s, sys: 173 ms, total: 10.2 s
Wall time: 2.52 s
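The thread pool above can only use several cores if the jitted function releases the GIL. The post does not show that step, so the following is my assumption of what it looks like: compile with `nogil=True`, then call from threads. `busy_sum` is a hypothetical toy kernel standing in for the simulation.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from numba import njit
except ImportError:  # assumption: plain-Python fallback so the sketch still runs
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit(nogil=True)  # release the GIL while the compiled code runs
def busy_sum(n):
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


busy_sum(10)  # warm-up call: compile once before starting the threads

with ThreadPoolExecutor(4) as ex:
    # list() forces the lazy map, so errors raised in workers are not silently dropped
    results = list(ex.map(busy_sum, [200_000] * 8))

print(len(results), results[0])
```

Note that `ex.map` is lazy about reporting errors: an exception inside a worker only surfaces when the results are consumed, which is why the sketch wraps it in `list()`.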

Parallel

Numba can actually parallelize code natively, but in this case you need to define a wrapper function.

from numba import prange
@njit(nogil=True, parallel=True)
def run_sims(end=1000):
    for x0 in prange(int(end/0.1)):
        if x0 == 0:
            continue
        simulate_spring_mass_funky_damper(x0*0.1)
        
%time run_sims()

CPU times: user 10.5 s, sys: 12.4 ms, total: 10.5 s
Wall time: 271 ms
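The `run_sims` wrapper above throws the simulation results away. If you want to keep them, one approach is to write each result into a preallocated array indexed by the `prange` loop variable. This is a sketch under my own assumptions: `final_position` is a hypothetical, simplified spring model with linear damping only, not the friction model from the example, and the fallback lets it run without Numba.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # assumption: plain-Python fallback so the sketch still runs
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func
    prange = range


@njit
def final_position(x0, steps, dt):
    # Simplified damped spring (linear damping only), integrated step by step.
    v = 0.0
    x = x0
    for _step in range(steps):
        a = -3.0 * v - 100.0 * x
        v = v + a * dt
        x = x + v * dt
    return x / x0


@njit(parallel=True)
def run_sims_collect(n_sims, steps, dt):
    out = np.empty(n_sims)
    # Each iteration writes only to its own slot, so iterations are independent.
    for i in prange(n_sims):
        out[i] = final_position((i + 1) * 0.1, steps, dt)
    return out


results = run_sims_collect(100, 1000, 0.001)
print(results[:3])
```

Starting the initial condition at `(i + 1) * 0.1` avoids x0 = 0, which plays the same role as the `if x0 == 0: continue` check in the original wrapper.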

I came across this by chance and am posting it for now, since I feel I should practice it.

https://www.youtube.com/watch?v=x58W9A2lnQc&ab_channel=JackofSome

https://gist.github.com/safijari/fa4eba922cea19b3bc6a693fe2a97af7
