Convert a string to binary in Python

yourdevel 2020. 10. 4. 13:36

I need a way to get the binary representation of a string in Python. For example:

st = "hello world"
toBinary(st)

Is there a module that offers a neat way to do this?

Something like this?

>>> st = "hello world"
>>> ' '.join(format(ord(x), 'b') for x in st)
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'

# using `bytearray` (Python 2; Python 3 also needs an encoding argument)
>>> ' '.join(format(x, 'b') for x in bytearray(st))
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'

As a more Pythonic approach, you can first convert the string to a byte array and then use the bin function within map:

>>> st = "hello world"
>>> map(bin,bytearray(st))
['0b1101000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100']

Or you can join the results:

>>> ' '.join(map(bin,bytearray(st)))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

In Python 3 you need to specify an encoding for the bytearray function:

>>> ' '.join(map(bin,bytearray(st,'utf8')))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'
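
The Python 3 variant above can be wrapped in a small helper. What follows is a minimal sketch rather than anything from the original answers: the name `to_binary` and its `encoding` parameter are hypothetical, and it pads every byte to 8 bits, which the snippets above do not do.

```python
def to_binary(text, encoding="utf-8"):
    """Return the encoded bytes of `text` as space-separated 8-bit groups.

    Hypothetical helper for illustration; not part of any library.
    """
    # str.encode() yields a bytes object; iterating over it gives ints 0-255.
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


print(to_binary("hello world"))
```

Padding to a fixed width keeps every group the same length, which makes the output easier to split back into bytes later.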

You can also use the binascii module in Python 2:

>>> import binascii
>>> bin(int(binascii.hexlify(st),16))
'0b110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

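hexlify returns the hexadecimal representation of the binary data. You then convert that to an int by specifying 16 as the base, and finally convert the int to binary with bin.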
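
In Python 3, hexlify expects bytes rather than str, so the string has to be encoded first. The sketch below assumes UTF-8 and also shows `int.from_bytes` as an alternative route to the same integer; the variable names are illustrative only.

```python
import binascii

st = "hello world"
data = st.encode("utf-8")  # assumption: UTF-8; hexlify needs bytes in Python 3

# Route 1: bytes -> hex digits -> int (base 16) -> binary string
via_hexlify = bin(int(binascii.hexlify(data), 16))

# Route 2: interpret the bytes directly as one big-endian integer
via_from_bytes = bin(int.from_bytes(data, "big"))

print(via_hexlify == via_from_bytes)
```

Both routes treat the whole string as a single number, so the leading zero bit of the first byte is dropped and there are no separators between bytes.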

You just need to encode it.

'string'.encode('ascii')

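You can access the code values for the characters in your string using the ord() built-in function. If you then need to format them in binary, the built-in format() function (or the str.format() method) will do the job.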

a = "test"
print(' '.join(format(ord(x), 'b') for x in a))

(Thanks to Ashwini Chaudhary for posting that code snippet.)

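The code above works in Python 3, but the matter gets more complicated if you assume any encoding other than UTF-8. In Python 2, strings are byte sequences, and ASCII encoding is assumed by default. In Python 3, strings are assumed to be Unicode, and there is a separate bytes type that behaves more like a Python 2 string. If you want to assume any encoding other than UTF-8, you need to specify the encoding.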

In Python 3, then, you can do something like this:

a = "test"
a_bytes = bytes(a, "ascii")
print(' '.join(["{0:b}".format(x) for x in a_bytes]))

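The difference between UTF-8 and ascii encoding is not obvious for simple alphanumeric strings, but it becomes important when you process text that includes characters outside the ascii character set.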
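
To make that difference concrete, here is a small sketch using a single non-ASCII character. The choice of "é" (U+00E9) is just an illustrative assumption; any character outside ASCII behaves similarly.

```python
text = "\u00e9"  # "é", a character outside the ASCII character set

# ord() gives the Unicode code point as one number.
code_point_bits = format(ord(text), "b")
print(code_point_bits)  # 11101001

# UTF-8 encodes the same character as two bytes.
utf8_bits = " ".join(format(b, "08b") for b in text.encode("utf-8"))
print(utf8_bits)  # 11000011 10101001

# ASCII cannot represent the character at all.
ascii_error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc
    print("ascii encoding failed:", exc.reason)
```

The code point and the encoded bytes are different bit patterns, so it matters whether you want the binary form of the characters or of their encoded bytes.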

This is an update to the existing answers that use bytearray(), which no longer works that way:

>>> st = "hello world"
>>> map(bin, bytearray(st))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding

That is because, as the documentation for bytearray explains, if the source is a string you must also give the encoding:

>>> map(bin, bytearray(st, encoding='utf-8'))
<map object at 0x7f14dfb1ff28>
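
The last result is a lazy map object, not a list, because map returns an iterator in Python 3. A minimal sketch of how to materialize it, assuming UTF-8 as in the snippet above:

```python
st = "hello world"

# map() is lazy in Python 3; wrap it in list() to see the values.
as_list = list(map(bin, bytearray(st, encoding="utf-8")))
print(as_list[:2])  # ['0b1101000', '0b1100101']

# str.join() consumes the iterator directly, so no list() is needed here.
joined = " ".join(map(bin, bytearray(st, encoding="utf-8")))
print(joined)
```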

def method_a(sample_string):
    # Convert each character to its code point, then format it as binary.
    binary = ' '.join(format(ord(x), 'b') for x in sample_string)

def method_b(sample_string):
    # Encode once into a byte array, then let map() apply bin() to each byte.
    binary = ' '.join(map(bin,bytearray(sample_string,encoding='utf-8')))


if __name__ == '__main__':

    from timeit import timeit

    sample_string = 'Convert this ascii strong to binary.'

    # timeit runs each statement 1,000,000 times by default
    # and returns the total time in seconds.
    print(
        timeit(f'method_a("{sample_string}")',setup='from __main__ import method_a'),
        timeit(f'method_b("{sample_string}")',setup='from __main__ import method_b')
    )

# 9.564299999998184 2.943955828988692

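method_b is substantially more efficient because converting to a byte array makes low-level function calls, instead of manually transforming every character to an integer and then converting that integer to its binary value.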

In Python 3.6 and above you can use an f-string to format the result.

text = "hello world"  # avoid naming the variable `str`, which shadows the built-in type
print(" ".join(f"{ord(i):08b}" for i in text))

01101000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100

The left side of the colon, ord(i), is the actual object whose value will be formatted and inserted into the output. Using ord() gives you the base-10 code point for a single str character.
The right hand side of the colon is the format specifier. 08 means width 8, 0 padded, and the b functions as a sign to output the resulting number in base 2 (binary).
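
As a quick illustration of how the format specifier changes the output, the sketch below formats one code point in several ways. The character "A" is an arbitrary choice for the example.

```python
n = ord("A")  # code point 65

plain = f"{n:b}"         # no padding
padded = f"{n:08b}"      # zero-padded to width 8
prefixed = f"{n:#010b}"  # '#' adds the 0b prefix; width 10 includes the prefix

print(plain)     # 1000001
print(padded)    # 01000001
print(prefixed)  # 0b01000001

# The built-in format() accepts the same specifier outside an f-string.
print(format(n, "08b"))  # 01000001
```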

a = list(input("Enter a string\t: "))
def fun(a):
    c =' '.join(['0'*(8-len(bin(ord(i))[2:]))+(bin(ord(i))[2:]) for i in a])
    return c
print(fun(a))

Reference URL: https://stackoverflow.com/questions/18815820/convert-string-to-binary-in-python
