Python: Reading tab-delimited file with Pandas - works on Windows, but not on Mac

I have been reading a tab-delimited data file on Windows with Pandas/Python without any problems. The data file has notes in its first three lines, followed by a header row.

df = pd.read_csv(myfile, sep='\t', skiprows=(0, 1, 2), header=0)

Now I am trying to read the same file on my Mac. (This is my first time using Python on a Mac.) I get the following error:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 1
fields in line 8, saw 39

If I set the error_bad_lines argument of read_csv to False, I get the following output, which continues through the end of the last row:

Skipping line 8: expected 1 fields, saw 39
Skipping line 9: expected 1 fields, saw 125
Skipping line 10: expected 1 fields, saw 125
Skipping line 11: expected 1 fields, saw 125
Skipping line 12: expected 1 fields, saw 125
Skipping line 13: expected 1 fields, saw 125
Skipping line 14: expected 1 fields, saw 125
Skipping line 15: expected 1 fields, saw 125
Skipping line 16: expected 1 fields, saw 125
Skipping line 17: expected 1 fields, saw 125
...

Do I need to specify a value for the encoding argument? It seems as though it should not be necessary, because reading the file works fine on Windows.

Solution

The biggest clue is that the rows are all being returned on one line. This indicates that the line terminators are being ignored or are not present.
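One way to check this is to look at the raw bytes of the file. The following is a minimal sketch and not part of the original answer: the helper name count_line_endings and the sample byte strings are made up for illustration, and it assumes an ASCII-compatible encoding (in UTF-16 the \r and \n bytes are separated by null bytes, so \r\n pairs would not be detected as pairs).

```python
def count_line_endings(raw: bytes) -> dict:
    """Count each line-ending style in a chunk of raw file bytes."""
    crlf = raw.count(b"\r\n")
    # Subtract the pairs so a Windows "\r\n" is not also counted
    # as a lone "\r" or a lone "\n".
    lone_cr = raw.count(b"\r") - crlf
    lone_lf = raw.count(b"\n") - crlf
    return {"crlf": crlf, "cr": lone_cr, "lf": lone_lf}


# Hypothetical samples; for a real file, pass the result of
# open(path, "rb").read() instead.
mac_style = b"col_a\tcol_b\r1\t2\r3\t4\r"
windows_style = b"col_a\tcol_b\r\n1\t2\r\n"

print(count_line_endings(mac_style))
print(count_line_endings(windows_style))
```

If almost all of the hits land in the "cr" bucket, the file uses bare \r terminators.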

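You can specify the line terminator for read_csv. If the file was written with classic Mac line endings, each line ends with \r rather than the Linux standard \n or, better still, the belt-and-suspenders Windows approach of \r\n.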

pandas.read_csv(filename, sep='\t', lineterminator='\r')
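To combine this with the layout described in the question (three note lines, then a header), something like the following may work. This is a sketch under assumptions: the in-memory bytes and the column names col_a and col_b are invented stand-ins for the real file.

```python
import io

import pandas as pd

# Made-up stand-in for the file: three note lines, a header row and
# two data rows, all terminated by a bare "\r".
raw = b"note 1\rnote 2\rnote 3\rcol_a\tcol_b\r1\t2\r3\t4\r"

df = pd.read_csv(
    io.BytesIO(raw),
    sep="\t",
    lineterminator="\r",  # must be a single character; C parser only
    skiprows=3,           # skip the three note lines
    header=0,             # first remaining line holds the column names
)

print(df)
```

Because lineterminator accepts only a single character, it cannot be set to \r\n; it is meant for files with one consistent terminator.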

You could also open the data using the codecs package. This may increase robustness at the expense of document loading speed.

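import codecs
import pandas

# Open for reading and decode as UTF-16 (substitute the file's real encoding).
# The original 'rU' ("universal" newlines) mode was removed in Python 3.11, so plain 'r' is used.
doc = codecs.open('document', 'r', 'UTF-16')

df = pandas.read_csv(doc, sep='\t')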
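On the question of whether an encoding is needed: UTF-16 above is only an example, and the right value depends on the file. A rough sketch for checking is shown below; the helper name guess_encoding_from_bom is hypothetical, it only recognizes files that start with a byte order mark, and anything else falls back to an assumed default.

```python
import codecs


def guess_encoding_from_bom(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on a leading byte order mark, if any."""
    # UTF-32 must be tested first: its little-endian BOM begins with
    # the same two bytes as the UTF-16 little-endian BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No BOM found: this is only a guess, not a detection.
    return default


# Hypothetical sample; for a real file, pass open(path, "rb").read(4) instead.
sample = "col_a\tcol_b\r".encode("utf-16")
print(guess_encoding_from_bom(sample))
```

The returned name can then be passed as the encoding argument of read_csv.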

Reference: https://stackoverflow.com/questions/27896214
