Python: How do I find spans with a specific class that contain specific text, using Beautiful Soup and re?

How can I find all spans with the class 'blue' that contain text in the following format?

04/18/13 7:29pm

So the span text could be:

04/18/13 7:29pm

or:

Posted on 04/18/13 7:29pm

In terms of constructing the logic to do this, here is what I have so far:

new_content = original_content.find_all('span', {'class' : 'blue'}) # using beautiful soup's find_all
pattern = re.compile('<span class=\"blue\">[data in the format 04/18/13 7:29pm]</span>') # using re
for _ in new_content:
    result = re.findall(pattern, _)
    print result

Edit:

To clarify the scenario, there are spans like this:

<span class="blue">here is a lot of text that i don't need</span>

and

<span class="blue">this is the span i need because it contains 04/18/13 7:29pm</span>

I only need 04/18/13 7:29pm, not the rest of the content.

Edit 2:

I also tried:

pattern = re.compile('<span class="blue">.*?(\d\d/\d\d/\d\d \d\d?:\d\d\w\w)</span>')
for _ in new_content:
    result = re.findall(pattern, _)
    print result

which raised this error:

'TypeError: expected string or buffer'

Solution

import re
from bs4 import BeautifulSoup

html_doc = """
<html>
<body>
<span class="blue">here is a lot of text that i don't need</span>
<span class="blue">this is the span i need because it contains 04/18/13 7:29pm</span>
<span class="blue">04/19/13 7:30pm</span>
<span class="blue">Posted on 04/20/13 10:31pm</span>
</body>
</html>
"""

# parse the html (naming the parser explicitly avoids bs4's "no parser specified" warning)
soup = BeautifulSoup(html_doc, 'html.parser')

# find a list of all span elements
spans = soup.find_all('span', {'class' : 'blue'})

# create a list of lines corresponding to element texts
lines = [span.get_text() for span in spans]

# collect the dates from the list of lines using regex matching groups
found_dates = []
for line in lines:
    m = re.search(r'(\d{2}/\d{2}/\d{2} \d+:\d+[ap]m)', line)
    if m:
        found_dates.append(m.group(1))

# print the dates we collected
for date in found_dates:
    print(date)

Output:

04/18/13 7:29pm
04/19/13 7:30pm
04/20/13 10:31pm
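
The TypeError in the question comes from passing Beautiful Soup Tag objects to re.findall, which only accepts strings; that is why the code above calls get_text() on each span before matching. As a minimal sketch of the matching step on its own, the helper below (extract_dates and DATE_PATTERN are illustrative names, not part of the original answer) takes plain strings, standing in for the get_text() results, so it runs with the standard library alone. It assumes at most one date per span and a one- or two-digit hour.

```python
import re

# Date/time shape from the question: MM/DD/YY H:MMam or H:MMpm.
# Assumption: the hour has one or two digits and the minutes exactly two.
DATE_PATTERN = re.compile(r'\d{2}/\d{2}/\d{2} \d{1,2}:\d{2}[ap]m')


def extract_dates(texts):
    """Return the first date/time found in each text, skipping texts without one."""
    dates = []
    for text in texts:
        match = DATE_PATTERN.search(text)
        if match:
            dates.append(match.group(0))
    return dates


# Plain strings standing in for the values returned by span.get_text().
sample_texts = [
    "here is a lot of text that i don't need",
    "this is the span i need because it contains 04/18/13 7:29pm",
    "04/19/13 7:30pm",
    "Posted on 04/20/13 10:31pm",
]

print(extract_dates(sample_texts))
```

With the parsed document from the answer, the same helper could be called as extract_dates(span.get_text() for span in spans).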

Reference: https://stackoverflow.com/questions/16248723
