Compare commits
173 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | d98d34d8b3 | | |
| | 24fa104e84 | | |
| | b4dad8c641 | | |
| | 3550cd6d91 | | |
| | 2815b48e0e | | |
| | 650e6ccb65 | | |
| | 4a00a19a43 | | |
| | b067eda7b6 | | |
| | 1b6bc86e76 | | |
| | da2b513bcc | | |
| | 6adae578ef | | |
| | 128a834841 | | |
| | 086a14115f | | |
| | 6a392f3e1a | | |
| | 93127a703c | | |
| | e4ddbaf8ae | | |
| | ec75058605 | | |
| | 2b62e5dc5e | | |
| | 8d7874096e | | |
| | 99fcab83c8 | | |
| | 3027bc0579 | | |
| | b1b70a4e76 | | |
| | de41341d84 | | |
| | a03d43b081 | | |
| | f60aaade7f | | |
| | d3c34086ff | | |
| | 6b58c9bcf5 | | |
| | c2cba1651e | | |
| | ada3eb437d | | |
| | c1517d5be8 | | |
| | 351034d1e6 | | |
| | c1db5a0c47 | | |
| | 088dce712a | | |
| | 425e880b09 | | |
| | 62ec78abee | | |
| | c84a32682c | | |
| | 74277b2afe | | |
| | cd20b74b2a | | |
| | 06f54fd985 | | |
| | 98b0470703 | | |
| | bb4113b53c | | |
| | 07f4382ed4 | | |
| | d40720616b | | |
| | eebe7c79bd | | |
| | 6c9e327e36 | | |
| | e9161c0ddd | | |
| | c8b75dcf0e | | |
| | 30cb7d7043 | | |
| | 19d5b74beb | | |
| | d5c3e45edc | | |
| | 1d479fc15c | | |
| | 20a20ddd08 | | |
| | 00c239f974 | | |
| | 67b766b32c | | |
| | 249aa0d147 | | |
| | c708a588d8 | | |
| | cb15df525f | | |
| | fcddc1516b | | |
| | a7732efd07 | | |
| | 0a2f4e8418 | | |
| | 0c0ba0dfe6 | | |
| | 02827b174e | | |
| | 81dee8a218 | | |
| | 5eb8bdbd0e | | |
| | a37602e666 | | |
| | 306b69198e | | |
| | 175e457052 | | |
| | 5633a48618 | | |
| | d7e608e8a1 | | |
| | 213427fab3 | | |
| | 3427c6fb69 | | |
| | 603c4470b7 | | |
| | 37c8b7ae45 | | |
| | d362152c77 | | |
| | 8f5c3f312a | | |
| | 15a1d5c210 | | |
| | 499cf26fa8 | | |
| | 90596be880 | | |
| | 50d7b097e6 | | |
| | b8d5ec5465 | | |
| | 3200c5654f | | |
| | 4905b1e4d8 | | |
| | 16df63c14e | | |
| | e950dff9d2 | | |
| | 39d99ad4af | | |
| | 3675c91240 | | |
| | 46258f625a | | |
| | 2cc161b589 | | |
| | 115277e5e1 | | |
| | ebf0e7c181 | | |
| | b418898eef | | |
| | 3106b3e545 | | |
| | 50816a661d | | |
| | 6755bc8bb2 | | |
| | d62e7730ab | | |
| | 26be989b9b | | |
| | 73ad0a1f44 | | |
| | 66b185ebf7 | | |
| | 8bd82713e2 | | |
| | 71650c39f7 | | |
| | 488445c73b | | |
| | 075e811efe | | |
| | 9f9b83f185 | | |
| | 58d9bf7fdb | | |
| | b3e6275de7 | | |
| | 748778f545 | | |
| | b2a68d0a74 | | |
| | e29b3b8377 | | |
| | 0859ed5fb1 | | |
| | a80d5ba080 | | |
| | ac2924824e | | |
| | b7e6043a71 | | |
| | 820ba35013 | | |
| | ecd2d130bf | | |
| | 1d410b6e68 | | |
| | f77a2c889b | | |
| | 47d5ab288f | | |
| | 5f53fd24dd | | |
| | 11a9d0e2d7 | | |
| | 6f18de46f7 | | |
| | 480c9e15b8 | | |
| | 35aa7636f6 | | |
| | 8fee67c2d4 | | |
| | 74bfdd07e2 | | |
| | d3f1643a40 | | |
| | eb29f27493 | | |
| | 8adf75ab83 | | |
| | 2e05803d75 | | |
| | f16c0ee73a | | |
| | a338f2b782 | | |
| | 864ccddfd7 | | |
| | 339df69e36 | | |
| | 76a5b0cd18 | | |
| | be0ab2431b | | |
| | 2edb60c592 | | |
| | 2c6c3a1ca3 | | |
| | 4be540793d | | |
| | 08b86fe596 | | |
| | 157f3b9952 | | |
| | 8f3ca2662a | | |
| | c4b015861c | | |
| | 3aa413d59e | | |
| | 03ba285a16 | | |
| | 5fe0ee5aa8 | | |
| | 4e829a25d4 | | |
| | 15132a9bb8 | | |
| | 64ace9dad6 | | |
| | 9a2e96d3a0 | | |
| | a3695a59b8 | | |
| | bc8655ed62 | | |
| | 3bdc465740 | | |
| | 235d6b7212 | | |
| | 9f0754da57 | | |
| | 306b0a4564 | | |
| | 1c49387f1a | | |
| | 300d96e56c | | |
| | 0e301f48a8 | | |
| | a790ab13a9 | | |
| | 0456300d19 | | |
| | 2ef1e7028f | | |
| | 9413c4a186 | | |
| | 8a8cef399f | | |
| | 3bcad12cf6 | | |
| | 4eb18279fe | | |
| | e9ed564e1b | | |
| | 95f975c93d | | |
| | 8012e1d191 | | |
| | 404727c49c | | |
| | 5f78a99507 | | |
| | 497d84015e | | |
| | 5d3c7b5abd | | |
| | f16ef60f11 | | |
| | 5fa4d051ee | | |
27
.github/workflows/run_test.yml
vendored
Normal file
@@ -0,0 +1,27 @@
|
||||
name: Run All UnitTest
|
||||
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
build:
|
||||
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
max-parallel: 4
|
||||
matrix:
|
||||
python-version: [3.7, 3.8]
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: Set up Python ${{ matrix.python-version }}
|
||||
uses: actions/setup-python@v2
|
||||
with:
|
||||
python-version: ${{ matrix.python-version }}
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install -r requirements.txt -r requirements_test.txt
|
||||
- name: Test with pytest
|
||||
run: |
|
||||
export PYTHONPATH=./
|
||||
pytest --verbose --color=yes
|
||||
163
README.md
@@ -5,9 +5,7 @@ pytchat is a python library for fetching youtube live chat.
|
||||
|
||||
## Description
|
||||
pytchat is a python library for fetching youtube live chat
|
||||
without using youtube api, Selenium or BeautifulSoup.
|
||||
|
||||
pytchat is a Python library for viewing YouTube chat.
|
||||
without using Selenium or BeautifulSoup.
|
||||
|
||||
Other features:
|
||||
+ Customizable [chat data processors](https://github.com/taizan-hokuto/pytchat/wiki/ChatProcessor) including youtube api compatible one.
|
||||
@@ -16,7 +14,7 @@ Other features:
|
||||
instead of web scraping.
|
||||
|
||||
For more detailed information, see [wiki](https://github.com/taizan-hokuto/pytchat/wiki). <br>
|
||||
For a more detailed explanation, see the [wiki](https://github.com/taizan-hokuto/pytchat/wiki/Home_jp).
|
||||
[wiki (Japanese)](https://github.com/taizan-hokuto/pytchat/wiki/Home_jp)
|
||||
|
||||
## Install
|
||||
```python
|
||||
@@ -26,145 +24,61 @@ pip install pytchat
|
||||
|
||||
### CLI
|
||||
|
||||
One-liner command.
|
||||
Save chat data to html, with embedded custom emojis.
|
||||
+ One-liner command.
|
||||
|
||||
+ Save chat data to html with embedded custom emojis.
|
||||
|
||||
+ Show chat stream (--echo option).
|
||||
|
||||
```bash
|
||||
$ pytchat -v https://www.youtube.com/watch?v=ZJ6Q4U_Vg6s -o "c:/temp/"
|
||||
$ pytchat -v uIx8l2xlYVY -o "c:/temp/"
|
||||
# options:
|
||||
# -v : Video ID or URL that includes ID
|
||||
# -o : output directory (default path: './')
|
||||
# --echo : Show chats.
|
||||
# saved filename is [video_id].html
|
||||
```
|
||||
|
||||
|
||||
### on-demand mode
|
||||
### Fetch chat data (see [wiki](https://github.com/taizan-hokuto/pytchat/wiki/PytchatCore))
|
||||
```python
|
||||
from pytchat import LiveChat
|
||||
livechat = LiveChat(video_id = "Zvp1pJpie4I")
|
||||
# It is also possible to specify a URL that includes the video ID:
|
||||
# livechat = LiveChat("https://www.youtube.com/watch?v=Zvp1pJpie4I")
|
||||
while livechat.is_alive():
|
||||
try:
|
||||
chatdata = livechat.get()
|
||||
for c in chatdata.items:
|
||||
print(f"{c.datetime} [{c.author.name}]- {c.message}")
|
||||
chatdata.tick()
|
||||
except KeyboardInterrupt:
|
||||
livechat.terminate()
|
||||
break
|
||||
```
|
||||
|
||||
### callback mode
|
||||
```python
|
||||
from pytchat import LiveChat
|
||||
import time
|
||||
|
||||
def main():
|
||||
livechat = LiveChat(video_id = "Zvp1pJpie4I", callback = disp)
|
||||
while livechat.is_alive():
|
||||
#other background operation.
|
||||
time.sleep(1)
|
||||
livechat.terminate()
|
||||
|
||||
#callback function (automatically called)
|
||||
def disp(chatdata):
|
||||
for c in chatdata.items:
|
||||
print(f"{c.datetime} [{c.author.name}]- {c.message}")
|
||||
chatdata.tick()
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
|
||||
```
|
||||
|
||||
### asyncio context:
|
||||
```python
|
||||
from pytchat import LiveChatAsync
|
||||
from concurrent.futures import CancelledError
|
||||
import asyncio
|
||||
|
||||
async def main():
|
||||
livechat = LiveChatAsync("Zvp1pJpie4I", callback = func)
|
||||
while livechat.is_alive():
|
||||
#other background operation.
|
||||
await asyncio.sleep(3)
|
||||
|
||||
#callback function is automatically called.
|
||||
async def func(chatdata):
|
||||
for c in chatdata.items:
|
||||
print(f"{c.datetime} [{c.author.name}]-{c.message} {c.amountString}")
|
||||
await chatdata.tick_async()
|
||||
|
||||
if __name__ == '__main__':
|
||||
try:
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.run_until_complete(main())
|
||||
except CancelledError:
|
||||
pass
|
||||
```
|
||||
|
||||
|
||||
### youtube api compatible processor:
|
||||
```python
|
||||
from pytchat import LiveChat, CompatibleProcessor
|
||||
import time
|
||||
|
||||
chat = LiveChat("Zvp1pJpie4I",
|
||||
processor = CompatibleProcessor() )
|
||||
|
||||
import pytchat
|
||||
chat = pytchat.create(video_id="uIx8l2xlYVY")
|
||||
while chat.is_alive():
|
||||
try:
|
||||
data = chat.get()
|
||||
polling = data['pollingIntervalMillis']/1000
|
||||
for c in data['items']:
|
||||
if c.get('snippet'):
|
||||
print(f"[{c['authorDetails']['displayName']}]"
|
||||
f"-{c['snippet']['displayMessage']}")
|
||||
time.sleep(polling/len(data['items']))
|
||||
except KeyboardInterrupt:
|
||||
chat.terminate()
|
||||
for c in chat.get().sync_items():
|
||||
print(f"{c.datetime} [{c.author.name}]- {c.message}")
|
||||
```
|
||||
### replay:
|
||||
If the specified video is not live, pytchat automatically tries to fetch archived chat data.
|
||||
|
||||
|
||||
### Output JSON format string (feature of [DefaultProcessor](https://github.com/taizan-hokuto/pytchat/wiki/DefaultProcessor))
|
||||
```python
|
||||
from pytchat import LiveChat
|
||||
import pytchat
|
||||
import time
|
||||
|
||||
def main():
|
||||
#seektime (seconds): start position of chat.
|
||||
chat = LiveChat("ojes5ULOqhc", seektime = 60*30)
|
||||
print('Replay from 30:00')
|
||||
try:
|
||||
while chat.is_alive():
|
||||
data = chat.get()
|
||||
for c in data.items:
|
||||
print(f"{c.elapsedTime} [{c.author.name}]-{c.message} {c.amountString}")
|
||||
data.tick()
|
||||
except KeyboardInterrupt:
|
||||
chat.terminate()
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
chat = pytchat.create(video_id="uIx8l2xlYVY")
|
||||
while chat.is_alive():
|
||||
print(chat.get().json())
|
||||
time.sleep(5)
|
||||
'''
|
||||
# Each chat item can also be output in JSON format.
|
||||
for c in chat.get().items:
|
||||
print(c.json())
|
||||
'''
|
||||
```
|
||||
### Extract archived chat data as [HTML](https://github.com/taizan-hokuto/pytchat/wiki/HTMLArchiver) or [tab separated values](https://github.com/taizan-hokuto/pytchat/wiki/TSVArchiver).
|
||||
```python
|
||||
from pytchat import HTMLArchiver, Extractor
|
||||
|
||||
video_id = "*******"
|
||||
ex = Extractor(
|
||||
video_id,
|
||||
div=10,
|
||||
processor=HTMLArchiver("c:/test.html")
|
||||
)
|
||||
|
||||
ex.extract()
|
||||
print("finished.")
|
||||
```
|
||||
### other
|
||||
+ Fetch chat with a buffer ([LiveChat](https://github.com/taizan-hokuto/pytchat/wiki/LiveChat))
|
||||
|
||||
+ Use with asyncio ([LiveChatAsync](https://github.com/taizan-hokuto/pytchat/wiki/LiveChatAsync))
|
||||
|
||||
+ YT API compatible chat processor ([CompatibleProcessor](https://github.com/taizan-hokuto/pytchat/wiki/CompatibleProcessor))
|
||||
|
||||
+ Extract archived chat data ([Extractor](https://github.com/taizan-hokuto/pytchat/wiki/Extractor))
|
||||
|
||||
|
||||
## Structure of Default Processor
|
||||
Each item can be retrieved with the `items` function.
Each item can be retrieved with the `sync_items()` function.
|
||||
<table>
|
||||
<tr>
|
||||
<th>name</th>
|
||||
@@ -298,6 +212,9 @@ Most of source code of CLI refer to:
|
||||
|
||||
[PetterKraabol / Twitch-Chat-Downloader](https://github.com/PetterKraabol/Twitch-Chat-Downloader)
|
||||
|
||||
Progress bar in CLI is based on:
|
||||
|
||||
[vladignatyev/progress.py](https://gist.github.com/vladignatyev/06860ec2040cb497f0f3)
|
||||
|
||||
## Author
|
||||
|
||||
|
||||
@@ -1,14 +1,29 @@
|
||||
"""
|
||||
pytchat is a lightweight python library to browse youtube livechat without Selenium or BeautifulSoup.
|
||||
"""
|
||||
__copyright__ = 'Copyright (C) 2019 taizan-hokuto'
|
||||
__version__ = '0.1.4'
|
||||
__copyright__ = 'Copyright (C) 2019, 2020 taizan-hokuto'
|
||||
__version__ = '0.4.5'
|
||||
__license__ = 'MIT'
|
||||
__author__ = 'taizan-hokuto'
|
||||
__author_email__ = '55448286+taizan-hokuto@users.noreply.github.com'
|
||||
__url__ = 'https://github.com/taizan-hokuto/pytchat'
|
||||
|
||||
__all__ = ["core_async","core_multithread","processors"]
|
||||
|
||||
from .exceptions import (
|
||||
ChatParseException,
|
||||
ResponseContextError,
|
||||
NoContents,
|
||||
NoContinuation,
|
||||
IllegalFunctionCall,
|
||||
InvalidVideoIdException,
|
||||
UnknownConnectionError,
|
||||
RetryExceedMaxCount,
|
||||
ChatDataFinished,
|
||||
ReceivedUnknownContinuation,
|
||||
FailedExtractContinuation,
|
||||
VideoInfoParseError,
|
||||
PatternUnmatchError
|
||||
)
|
||||
|
||||
from .api import (
|
||||
cli,
|
||||
@@ -26,7 +41,7 @@ from .api import (
|
||||
SimpleDisplayProcessor,
|
||||
SpeedCalculator,
|
||||
SuperchatCalculator,
|
||||
VideoInfo
|
||||
VideoInfo,
|
||||
create
|
||||
)
|
||||
|
||||
# flake8: noqa
|
||||
@@ -1,5 +1,6 @@
|
||||
from . import cli
|
||||
from . import config
|
||||
from .core import create
|
||||
from .core_multithread.livechat import LiveChat
|
||||
from .core_async.livechat import LiveChatAsync
|
||||
from .processors.chat_processor import ChatProcessor
|
||||
@@ -15,4 +16,24 @@ from .processors.superchat.calculator import SuperchatCalculator
|
||||
from .tool.extract.extractor import Extractor
|
||||
from .tool.videoinfo import VideoInfo
|
||||
|
||||
__all__ = [
|
||||
cli,
|
||||
config,
|
||||
LiveChat,
|
||||
LiveChatAsync,
|
||||
ChatProcessor,
|
||||
CompatibleProcessor,
|
||||
DummyProcessor,
|
||||
DefaultProcessor,
|
||||
Extractor,
|
||||
HTMLArchiver,
|
||||
TSVArchiver,
|
||||
JsonfileArchiver,
|
||||
SimpleDisplayProcessor,
|
||||
SpeedCalculator,
|
||||
SuperchatCalculator,
|
||||
VideoInfo,
|
||||
create
|
||||
]
|
||||
|
||||
# flake8: noqa
|
||||
@@ -1,20 +1,21 @@
|
||||
import argparse
|
||||
try:
|
||||
from asyncio import CancelledError
|
||||
except ImportError:
|
||||
from asyncio.futures import CancelledError
|
||||
import os
|
||||
from pathlib import Path
|
||||
from pytchat.util.extract_video_id import extract_video_id
|
||||
from .arguments import Arguments
|
||||
from .. exceptions import InvalidVideoIdException, NoContents, VideoInfoParseException
|
||||
from .. processors.html_archiver import HTMLArchiver
|
||||
from .. tool.extract.extractor import Extractor
|
||||
from .. tool.videoinfo import VideoInfo
|
||||
from .echo import Echo
|
||||
from .. exceptions import InvalidVideoIdException
|
||||
from .. import __version__
|
||||
from .cli_extractor import CLIExtractor
|
||||
|
||||
|
||||
'''
|
||||
Most of CLI modules refer to
|
||||
Petter Kraabøl's Twitch-Chat-Downloader
|
||||
https://github.com/PetterKraabol/Twitch-Chat-Downloader
|
||||
(MIT License)
|
||||
|
||||
'''
|
||||
|
||||
|
||||
@@ -27,48 +28,44 @@ def main():
|
||||
'If ID starts with a hyphen (-), enclose the ID in square brackets.')
|
||||
parser.add_argument('-o', f'--{Arguments.Name.OUTPUT}', type=str,
|
||||
help='Output directory (end with "/"). default="./"', default='./')
|
||||
parser.add_argument(f'--{Arguments.Name.DEBUG}', action='store_true',
|
||||
help='Debug mode. Stop when exceptions have occurred and save error data (".dat" file).')
|
||||
parser.add_argument(f'--{Arguments.Name.VERSION}', action='store_true',
|
||||
help='Show version')
|
||||
help='Show version.')
|
||||
parser.add_argument(f'--{Arguments.Name.ECHO}', action='store_true',
|
||||
help='Display chats of specified video.')
|
||||
|
||||
Arguments(parser.parse_args().__dict__)
|
||||
|
||||
if Arguments().print_version:
|
||||
print(f'pytchat v{__version__} © 2019 taizan-hokuto')
|
||||
print(f'pytchat v{__version__} © 2019, 2020 taizan-hokuto')
|
||||
return
|
||||
|
||||
if not Arguments().video_ids:
|
||||
parser.print_help()
|
||||
return
|
||||
|
||||
# Echo
|
||||
if Arguments().echo:
|
||||
if len(Arguments().video_ids) > 1:
|
||||
print("When using --echo option, only one video ID can be specified.")
|
||||
return
|
||||
try:
|
||||
Echo(Arguments().video_ids[0]).run()
|
||||
except InvalidVideoIdException as e:
|
||||
print("Invalid video id:", str(e))
|
||||
except Exception as e:
|
||||
print(type(e), str(e))
|
||||
if Arguments().debug:
|
||||
raise
|
||||
finally:
|
||||
return
|
||||
|
||||
# Extractor
|
||||
if Arguments().video_ids:
|
||||
for video_id in Arguments().video_ids:
|
||||
if '[' in video_id:
|
||||
video_id = video_id.replace('[', '').replace(']', '')
|
||||
try:
|
||||
video_id = extract_video_id(video_id)
|
||||
if os.path.exists(Arguments().output):
|
||||
path = Path(Arguments().output + video_id + '.html')
|
||||
else:
|
||||
raise FileNotFoundError
|
||||
info = VideoInfo(video_id)
|
||||
print(f"Extracting...\n"
|
||||
f" video_id: {video_id}\n"
|
||||
f" channel: {info.get_channel_name()}\n"
|
||||
f" title: {info.get_title()}")
|
||||
|
||||
print(f" output path: {path.resolve()}")
|
||||
Extractor(video_id,
|
||||
processor=HTMLArchiver(
|
||||
Arguments().output + video_id + '.html'),
|
||||
callback=_disp_progress
|
||||
).extract()
|
||||
print("\nExtraction end.\n")
|
||||
except InvalidVideoIdException:
|
||||
print("Invalid Video ID or URL:", video_id)
|
||||
except (TypeError, NoContents) as e:
|
||||
print(e)
|
||||
except FileNotFoundError:
|
||||
print("The specified directory does not exist.:{}".format(Arguments().output))
|
||||
except VideoInfoParseException:
|
||||
print("Cannot parse video information.:{}".format(video_id))
|
||||
if not os.path.exists(Arguments().output):
|
||||
print("\nThe specified directory does not exist.:{}\n".format(Arguments().output))
|
||||
return
|
||||
parser.print_help()
|
||||
|
||||
|
||||
def _disp_progress(a, b):
|
||||
print('.', end="", flush=True)
|
||||
try:
|
||||
CLIExtractor().run()
|
||||
except CancelledError as e:
|
||||
print(str(e))
|
||||
|
||||
@@ -18,6 +18,8 @@ class Arguments(metaclass=Singleton):
|
||||
VERSION: str = 'version'
|
||||
OUTPUT: str = 'output_dir'
|
||||
VIDEO_IDS: str = 'video_id'
|
||||
DEBUG: str = 'debug'
ECHO: str = 'echo'
|
||||
|
||||
def __init__(self,
|
||||
arguments: Optional[Dict[str, Union[str, bool, int]]] = None):
|
||||
@@ -34,10 +36,10 @@ class Arguments(metaclass=Singleton):
|
||||
self.print_version: bool = arguments[Arguments.Name.VERSION]
|
||||
self.output: str = arguments[Arguments.Name.OUTPUT]
|
||||
self.video_ids: List[str] = []
|
||||
self.debug: bool = arguments[Arguments.Name.DEBUG]
|
||||
self.echo: bool = arguments[Arguments.Name.ECHO]
|
||||
|
||||
# Videos
|
||||
if arguments[Arguments.Name.VIDEO_IDS]:
|
||||
self.video_ids = [video_id
|
||||
for video_id in arguments[Arguments.Name.VIDEO_IDS].split(',')]
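The comma split above, together with the CLI help text about enclosing IDs that start with a hyphen in square brackets, can be sketched as a small standalone helper. This is only an illustration: `parse_video_ids` is a hypothetical name, and the whitespace trimming and skipping of empty entries are assumptions that the diff itself does not show.

```python
from typing import List


def parse_video_ids(raw: str) -> List[str]:
    """Split a comma-separated argument into individual video IDs or URLs.

    Hypothetical helper, not part of pytchat.
    """
    ids = []
    for item in raw.split(','):
        item = item.strip()  # assumption: tolerate spaces around commas
        if not item:
            continue  # assumption: skip empty entries such as a trailing comma
        # Brackets protect IDs that start with '-' from being read as options,
        # so they are removed before the ID is used.
        ids.append(item.replace('[', '').replace(']', ''))
    return ids


video_ids = parse_video_ids("uIx8l2xlYVY,[-abcdefghij]")
print(video_ids)
```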
|
||||
|
||||
|
||||
|
||||
|
||||
121
pytchat/cli/cli_extractor.py
Normal file
@@ -0,0 +1,121 @@
|
||||
import asyncio
|
||||
import os
|
||||
import signal
|
||||
import traceback
|
||||
from httpcore import ReadTimeout as HCReadTimeout, NetworkError as HCNetworkError
|
||||
from json.decoder import JSONDecodeError
|
||||
from pathlib import Path
|
||||
from .arguments import Arguments
|
||||
from .progressbar import ProgressBar
|
||||
from .. import util
|
||||
from .. exceptions import InvalidVideoIdException, NoContents, PatternUnmatchError, UnknownConnectionError
|
||||
from .. processors.html_archiver import HTMLArchiver
|
||||
from .. tool.extract.extractor import Extractor
|
||||
from .. tool.videoinfo import VideoInfo
|
||||
from .. util.extract_video_id import extract_video_id
|
||||
|
||||
|
||||
class CLIExtractor:
|
||||
|
||||
def run(self) -> None:
|
||||
ex = None
|
||||
pbar = None
|
||||
for counter, video_id in enumerate(Arguments().video_ids):
|
||||
if len(Arguments().video_ids) > 1:
|
||||
print(f"\n{'-' * 10} video:{counter + 1} of {len(Arguments().video_ids)} {'-' * 10}")
|
||||
|
||||
try:
|
||||
video_id = extract_video_id(video_id)
|
||||
separated_path = str(Path(Arguments().output)) + os.path.sep
|
||||
path = util.checkpath(separated_path + video_id + '.html')
|
||||
try:
|
||||
info = VideoInfo(video_id)
|
||||
except (PatternUnmatchError, JSONDecodeError) as e:
|
||||
print("Cannot parse video information.:{} {}".format(video_id, type(e)))
|
||||
if Arguments().debug:
|
||||
util.save(str(e.doc), "ERR", ".dat")
|
||||
continue
|
||||
except Exception as e:
|
||||
print("Cannot parse video information.:{} {}".format(video_id, type(e)))
|
||||
continue
|
||||
|
||||
print(f"\n"
|
||||
f" video_id: {video_id}\n"
|
||||
f" channel: {info.get_channel_name()}\n"
|
||||
f" title: {info.get_title()}\n"
|
||||
f" output path: {path}")
|
||||
|
||||
duration = info.get_duration()
|
||||
pbar = ProgressBar(total=(duration * 1000), status_txt="Extracting")
|
||||
ex = Extractor(video_id,
|
||||
callback=pbar.disp,
|
||||
div=10)
|
||||
signal.signal(signal.SIGINT, (lambda a, b: self.cancel(ex, pbar)))
|
||||
|
||||
data = ex.extract()
|
||||
if data == [] or data is None:
|
||||
continue
|
||||
pbar.reset("#", "=", total=1000, status_txt="Rendering ")
|
||||
processor = HTMLArchiver(path, callback=pbar.disp)
|
||||
processor.process(
|
||||
[{'video_id': None,
|
||||
'timeout': 1,
|
||||
'chatdata': (action["replayChatItemAction"]["actions"][0] for action in data)}]
|
||||
)
|
||||
processor.finalize()
|
||||
pbar.reset('#', '#', status_txt='Completed ')
|
||||
pbar.close()
|
||||
print()
|
||||
if pbar.is_cancelled():
|
||||
print("\nThe extraction process has been discontinued.\n")
|
||||
except InvalidVideoIdException:
|
||||
print("Invalid Video ID or URL:", video_id)
|
||||
except NoContents as e:
|
||||
print(f"Abort:{str(e)}:[{video_id}]")
|
||||
except (JSONDecodeError, PatternUnmatchError) as e:
|
||||
print("{}:{}".format(e.msg, video_id))
|
||||
if Arguments().debug:
|
||||
filename = util.save(e.doc, "ERR_", ".dat")
|
||||
traceback.print_exc()
|
||||
print(f"Saved error data: {filename}")
|
||||
except (UnknownConnectionError, HCNetworkError, HCReadTimeout) as e:
|
||||
if Arguments().debug:
|
||||
traceback.print_exc()
|
||||
print(f"An unknown network error occurred during the processing of [{video_id}]. : " + str(e))
|
||||
except Exception as e:
|
||||
print(f"Abort:{str(type(e))} {str(e)[:80]}")
|
||||
if Arguments().debug:
|
||||
traceback.print_exc()
|
||||
finally:
|
||||
clear_tasks()
|
||||
|
||||
return
|
||||
|
||||
def cancel(self, ex=None, pbar=None) -> None:
|
||||
'''Called when a keyboard interrupt has occurred.
|
||||
'''
|
||||
print("\nKeyboard interrupted.\n")
|
||||
if ex and pbar:
|
||||
ex.cancel()
|
||||
pbar.cancel()
|
||||
|
||||
|
||||
def clear_tasks():
|
||||
'''
|
||||
Clear remaining tasks.
Called when an internal exception has occurred or
|
||||
after each extraction process is completed.
|
||||
'''
|
||||
async def _shutdown():
|
||||
tasks = [t for t in asyncio.all_tasks()
|
||||
if t is not asyncio.current_task()]
|
||||
for task in tasks:
|
||||
task.cancel()
|
||||
|
||||
try:
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.run_until_complete(_shutdown())
|
||||
except Exception as e:
|
||||
print(str(e))
|
||||
if Arguments().debug:
|
||||
traceback.print_exc()
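The cancellation pattern used by `clear_tasks()` can be tried in isolation with the minimal sketch below. It is not the project's code: `cancel_remaining_tasks` and `demo` are hypothetical names, and unlike the function above this version also awaits the cancelled tasks with `asyncio.gather` so they have finished by the time it returns.

```python
import asyncio


async def cancel_remaining_tasks() -> int:
    """Cancel every task on the running loop except the caller.

    Returns the number of tasks that were cancelled.
    """
    tasks = [t for t in asyncio.all_tasks()
             if t is not asyncio.current_task()]
    for task in tasks:
        task.cancel()
    # Wait until the cancellations are delivered; return_exceptions=True
    # keeps CancelledError from propagating out of gather().
    await asyncio.gather(*tasks, return_exceptions=True)
    return len(tasks)


async def demo() -> int:
    # Two hypothetical long-running background jobs.
    for _ in range(2):
        asyncio.create_task(asyncio.sleep(3600))
    await asyncio.sleep(0)  # give the tasks a chance to start
    return await cancel_remaining_tasks()


cancelled = asyncio.run(demo())
print(cancelled)
```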
|
||||
22
pytchat/cli/echo.py
Normal file
@@ -0,0 +1,22 @@
|
||||
import pytchat
|
||||
from ..exceptions import ChatDataFinished, NoContents
|
||||
from ..util.extract_video_id import extract_video_id
|
||||
|
||||
|
||||
class Echo:
|
||||
def __init__(self, video_id):
|
||||
self.video_id = extract_video_id(video_id)
|
||||
|
||||
def run(self):
|
||||
livechat = pytchat.create(self.video_id)
|
||||
while livechat.is_alive():
|
||||
chatdata = livechat.get()
|
||||
for c in chatdata.sync_items():
|
||||
print(f"{c.datetime} [{c.author.name}] {c.message} {c.amountString}")
|
||||
|
||||
try:
|
||||
livechat.raise_for_status()
|
||||
except (ChatDataFinished, NoContents):
|
||||
print("Chat finished.")
|
||||
except Exception as e:
|
||||
print(type(e), str(e))
|
||||
54
pytchat/cli/progressbar.py
Normal file
@@ -0,0 +1,54 @@
|
||||
'''
|
||||
This code is based on
|
||||
vladignatyev/progress.py
|
||||
https://gist.github.com/vladignatyev/06860ec2040cb497f0f3
|
||||
(MIT License)
|
||||
'''
|
||||
import shutil
|
||||
import sys
|
||||
|
||||
|
||||
class ProgressBar:
|
||||
def __init__(self, total, status_txt):
|
||||
self._bar_len = 60
|
||||
self._cancelled = False
|
||||
self.reset(total=total, status_txt=status_txt)
|
||||
|
||||
def reset(self, symbol_done="=", symbol_space=" ", total=100, status_txt=''):
|
||||
self._console_width = shutil.get_terminal_size(fallback=(80, 24)).columns
|
||||
self._symbol_done = symbol_done
|
||||
self._symbol_space = symbol_space
|
||||
self._total = total
|
||||
self._status_txt = status_txt
|
||||
self._count = 0
|
||||
|
||||
def disp(self, _, fetched):
|
||||
self._progress(fetched, self._total)
|
||||
|
||||
def _progress(self, fillin, total):
|
||||
if total == 0 or self._cancelled:
|
||||
return
|
||||
self._count += fillin
|
||||
filled_len = int(round(self._bar_len * self._count / float(total)))
|
||||
percents = round(100.0 * self._count / float(total), 1)
|
||||
if percents > 100:
|
||||
percents = 100.0
|
||||
if filled_len > self._bar_len:
|
||||
filled_len = self._bar_len
|
||||
|
||||
bar = self._symbol_done * filled_len + \
|
||||
self._symbol_space * (self._bar_len - filled_len)
|
||||
disp = f" [{bar}] {percents:>5.1f}% ...{self._status_txt} "[:self._console_width - 1] + '\r'
|
||||
|
||||
sys.stdout.write(disp)
|
||||
sys.stdout.flush()
|
||||
|
||||
def close(self):
|
||||
if not self._cancelled:
|
||||
self._progress(self._total, self._total)
|
||||
|
||||
def cancel(self):
|
||||
self._cancelled = True
|
||||
|
||||
def is_cancelled(self):
|
||||
return self._cancelled
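The arithmetic inside `_progress` (fill length and percentage, both clamped) can be checked without a terminal using a pure function. The sketch below is an illustration under that assumption: `render_bar` is a hypothetical name, and it leaves out the status text, console-width truncation and the trailing carriage return.

```python
def render_bar(count: float, total: float, bar_len: int = 60,
               symbol_done: str = "=", symbol_space: str = " ") -> str:
    """Return a text progress bar for `count` out of `total`."""
    if total == 0:
        return ""
    # Clamp both values so that overshooting the total never breaks the layout.
    filled_len = min(bar_len, int(round(bar_len * count / float(total))))
    percents = min(100.0, round(100.0 * count / float(total), 1))
    bar = symbol_done * filled_len + symbol_space * (bar_len - filled_len)
    return f" [{bar}] {percents:>5.1f}%"


print(render_bar(25, 100, bar_len=8))
print(render_bar(150, 100, bar_len=8))
```

Keeping the rendering separate from `sys.stdout.write` makes the clamping behaviour straightforward to unit-test.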
|
||||
@@ -1,7 +1,8 @@
|
||||
import logging
|
||||
import logging # noqa
|
||||
from . import mylogger
|
||||
headers = {
|
||||
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36'}
|
||||
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36',
|
||||
}
|
||||
|
||||
|
||||
def logger(module_name: str, loglevel=None):
|
||||
|
||||
7
pytchat/core/__init__.py
Normal file
@@ -0,0 +1,7 @@
|
||||
from .pytchat import PytchatCore
|
||||
from .. util.extract_video_id import extract_video_id
|
||||
|
||||
|
||||
def create(video_id: str, **kwargs):
|
||||
_vid = extract_video_id(video_id)
|
||||
return PytchatCore(_vid, **kwargs)
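`create()` normalizes its argument with `extract_video_id` before building `PytchatCore`, which is why both bare IDs and full URLs are accepted. The body of `extract_video_id` is not part of this diff, so the following is only a hypothetical stand-in that shows the idea. It assumes that a video ID is 11 characters drawn from letters, digits, `_` and `-`, and that URLs carry the ID in the `v` query parameter.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are exactly 11 characters from [0-9A-Za-z_-].
_ID_PATTERN = re.compile(r"[0-9A-Za-z_-]{11}")


def extract_video_id_sketch(url_or_id: str) -> str:
    """Hypothetical stand-in for pytchat's extract_video_id (not the real code)."""
    if _ID_PATTERN.fullmatch(url_or_id):
        return url_or_id
    # Otherwise treat the input as a URL and look at its "v" query parameter.
    candidates = parse_qs(urlparse(url_or_id).query).get("v", [])
    if candidates and _ID_PATTERN.fullmatch(candidates[0]):
        return candidates[0]
    raise ValueError(f"Invalid video id: {url_or_id}")


print(extract_video_id_sketch("https://www.youtube.com/watch?v=ZJ6Q4U_Vg6s"))
```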
|
||||
204
pytchat/core/pytchat.py
Normal file
@@ -0,0 +1,204 @@
|
||||
import httpx
|
||||
import json
|
||||
import signal
|
||||
import time
|
||||
import traceback
|
||||
import urllib.parse
|
||||
from ..parser.live import Parser
|
||||
from .. import config
|
||||
from .. import exceptions
|
||||
from ..paramgen import liveparam, arcparam
|
||||
from ..processors.default.processor import DefaultProcessor
|
||||
from ..processors.combinator import Combinator
|
||||
from ..util.extract_video_id import extract_video_id
|
||||
|
||||
headers = config.headers
|
||||
MAX_RETRY = 10
|
||||
|
||||
|
||||
class PytchatCore:
|
||||
'''
|
||||
|
||||
Parameter
|
||||
---------
|
||||
video_id : str
|
||||
|
||||
seektime : int
|
||||
start position of fetching chat (seconds).
|
||||
This option is valid for archived chat only.
|
||||
If negative value, chat data posted before the start of the broadcast
|
||||
will be retrieved as well.
|
||||
|
||||
processor : ChatProcessor
|
||||
|
||||
interruptable : bool
|
||||
Allows keyboard interrupts.
|
||||
Set this parameter to False if it causes problems
in your own multithreaded program.
|
||||
|
||||
force_replay : bool
|
||||
Force fetching archived chat data, even if the specified video is live.
|
||||
|
||||
topchat_only : bool
|
||||
If True, get only top chat.
|
||||
|
||||
hold_exception : bool [default:True]
|
||||
If True, when exceptions occur, the exception is held internally,
|
||||
and can be raised by raise_for_status().
|
||||
|
||||
Attributes
|
||||
---------
|
||||
_is_alive : bool
|
||||
Flag to stop getting chat.
|
||||
'''
|
||||
|
||||
_setup_finished = False
|
||||
|
||||
def __init__(self, video_id,
|
||||
seektime=-1,
|
||||
processor=DefaultProcessor(),
|
||||
interruptable=True,
|
||||
force_replay=False,
|
||||
topchat_only=False,
|
||||
hold_exception=True,
|
||||
logger=config.logger(__name__),
|
||||
):
|
||||
self._video_id = extract_video_id(video_id)
|
||||
self.seektime = seektime
|
||||
if isinstance(processor, tuple):
|
||||
self.processor = Combinator(processor)
|
||||
else:
|
||||
self.processor = processor
|
||||
self._is_alive = True
|
||||
self._is_replay = force_replay
|
||||
self._hold_exception = hold_exception
|
||||
self._exception_holder = None
|
||||
self._parser = Parser(
|
||||
is_replay=self._is_replay,
|
||||
exception_holder=self._exception_holder
|
||||
)
|
||||
self._first_fetch = True
|
||||
self._fetch_url = "live_chat/get_live_chat?continuation="
|
||||
self._topchat_only = topchat_only
|
||||
self._logger = logger
|
||||
if interruptable:
|
||||
signal.signal(signal.SIGINT, lambda a, b: self.terminate())
|
||||
self._setup()
|
||||
|
||||
def _setup(self):
|
||||
"""Fetch first continuation parameter,
create and start _listen loop.
"""
time.sleep(0.1)  # sleep briefly so that fetching data is not skipped
|
||||
self.continuation = liveparam.getparam(self._video_id, 3)
|
||||
|
||||
def _get_chat_component(self):
|
||||
|
||||
''' Fetch chat data and store them into buffer,
|
||||
get next continuation parameter and loop.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
continuation : str
|
||||
parameter for next chat data
|
||||
'''
|
||||
try:
|
||||
with httpx.Client(http2=True) as client:
|
||||
if self.continuation and self._is_alive:
|
||||
contents = self._get_contents(self.continuation, client, headers)
|
||||
metadata, chatdata = self._parser.parse(contents)
|
||||
timeout = metadata['timeoutMs'] / 1000
|
||||
chat_component = {
|
||||
"video_id": self._video_id,
|
||||
"timeout": timeout,
|
||||
"chatdata": chatdata
|
||||
}
|
||||
self.continuation = metadata.get('continuation')
|
||||
return chat_component
|
||||
except exceptions.ChatParseException as e:
|
||||
self._logger.debug(f"[{self._video_id}]{str(e)}")
|
||||
self._raise_exception(e)
|
||||
except Exception as e:
|
||||
self._logger.error(f"{traceback.format_exc(limit=-1)}")
|
||||
self._raise_exception(e)
|
||||
|
||||
def _get_contents(self, continuation, client, headers):
|
||||
'''Get 'continuationContents' from livechat json.
|
||||
If contents is None at first fetching,
|
||||
try to fetch archive chat data.
|
||||
|
||||
Return:
|
||||
-------
|
||||
'continuationContents' which includes metadata & chat data.
|
||||
'''
|
||||
livechat_json = (
|
||||
self._get_livechat_json(continuation, client, headers)
|
||||
)
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
if self._first_fetch:
|
||||
if contents is None or self._is_replay:
|
||||
'''Try to fetch archive chat data.'''
|
||||
self._parser.is_replay = True
|
||||
self._fetch_url = "live_chat_replay/get_live_chat_replay?continuation="
|
||||
continuation = arcparam.getparam(
|
||||
self._video_id, self.seektime, self._topchat_only)
|
||||
livechat_json = (self._get_livechat_json(continuation, client, headers))
|
||||
reload_continuation = self._parser.reload_continuation(
|
||||
self._parser.get_contents(livechat_json))
|
||||
if reload_continuation:
|
||||
livechat_json = (self._get_livechat_json(
|
||||
reload_continuation, client, headers))
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
self._is_replay = True
|
||||
self._first_fetch = False
|
||||
return contents
|
||||
|
||||
    def _get_livechat_json(self, continuation, client, headers):
        '''
        Get json which includes chat data.
        '''
        continuation = urllib.parse.quote(continuation)
        livechat_json = None
        err = None
        url = f"https://www.youtube.com/{self._fetch_url}{continuation}&pbj=1"
        for _ in range(MAX_RETRY + 1):
            with client:
                try:
                    livechat_json = client.get(url, headers=headers).json()
                    break
                except (json.JSONDecodeError, httpx.ConnectTimeout, httpx.ReadTimeout, httpx.ConnectError) as e:
                    err = e
                    time.sleep(2)
                    continue
        else:
            self._logger.error(f"[{self._video_id}]"
                               f"Exceeded retry count. Last error: {str(err)}")
            self._raise_exception(exceptions.RetryExceedMaxCount())
        return livechat_json
|
||||
|
||||
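The retry loop in `_get_livechat_json` relies on Python's `for ... else`: the `else` branch runs only when the loop ends without a `break`, i.e. when every attempt failed. The following is a minimal, network-free sketch of that pattern. The names `fetch_with_retry`, `make_flaky`, `max_retry` and `delay` are hypothetical and chosen for illustration only; they are not part of the library.

```python
import time


def fetch_with_retry(fetch, max_retry=3, delay=0.0):
    """Call fetch() until it succeeds or the retry budget is used up.

    `fetch` is any zero-argument callable (hypothetical for this sketch).
    """
    last_error = None
    for _ in range(max_retry + 1):
        try:
            result = fetch()
            break  # success: the else clause below is skipped
        except (ValueError, ConnectionError) as e:
            last_error = e
            time.sleep(delay)
    else:
        # Reached only when the loop finished without hitting `break`.
        raise RuntimeError(f"Exceeded retry count. Last error: {last_error}")
    return result


def make_flaky(fail_times):
    """Return a fake fetch function that fails `fail_times` times, then succeeds."""
    state = {"calls": 0}

    def fetch():
        state["calls"] += 1
        if state["calls"] <= fail_times:
            raise ConnectionError("temporary failure")
        return {"ok": True, "calls": state["calls"]}

    return fetch
```

Keeping the last caught error in a variable, as the diff does with `err`, lets the final log message say why the attempts failed rather than only that they did.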
    def get(self):
        if self.is_alive():
            chat_component = self._get_chat_component()
            return self.processor.process([chat_component])
        else:
            return []

    def is_replay(self):
        return self._is_replay

    def is_alive(self):
        return self._is_alive

    def terminate(self):
        self._is_alive = False
        self.processor.finalize()

    def raise_for_status(self):
        if self._exception_holder is not None:
            raise self._exception_holder

    def _raise_exception(self, exception: Exception = None):
        self.terminate()
        if self._hold_exception is False:
            raise exception
        self._exception_holder = exception
|
||||
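The `_raise_exception` / `raise_for_status` pair above defers errors: when `_hold_exception` is true, the failure is stored and the caller decides when to surface it. A minimal sketch of that flow follows, assuming a stand-in class `Fetcher` with a simulated failure; both are hypothetical and only mirror the shape of the methods in the diff.

```python
class Fetcher:
    """Hypothetical stand-in for the chat fetcher; not part of the library."""

    def __init__(self, hold_exception=True):
        self._hold_exception = hold_exception
        self._exception_holder = None
        self.alive = True

    def _raise_exception(self, exception):
        # Stop further fetching, then either raise now or keep for later.
        self.alive = False
        if self._hold_exception is False:
            raise exception
        self._exception_holder = exception

    def get(self):
        if not self.alive:
            return []
        try:
            raise ValueError("simulated parse failure")
        except ValueError as e:
            self._raise_exception(e)
        return []

    def raise_for_status(self):
        # Re-raise the stored exception, if any.
        if self._exception_holder is not None:
            raise self._exception_holder
```

With `hold_exception=True`, `get()` returns an empty list and the error only appears when `raise_for_status()` is called; with `hold_exception=False`, `get()` raises immediately.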
@@ -4,13 +4,13 @@ import asyncio
|
||||
|
||||
class Buffer(asyncio.Queue):
|
||||
'''
|
||||
チャットデータを格納するバッファの役割を持つFIFOキュー
|
||||
Buffer for storing chat data.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
maxsize : int
|
||||
格納するチャットブロックの最大個数。0の場合は無限。
|
||||
最大値を超える場合は古いチャットブロックから破棄される。
|
||||
Maximum number of chat blocks to be stored.
|
||||
If it exceeds the maximum, the oldest chat block will be discarded.
|
||||
'''
|
||||
|
||||
def __init__(self, maxsize=0):
|
||||
|
||||
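The docstring says that once `maxsize` is exceeded the oldest chat block is discarded. The hunk does not show how `Buffer` does this, so the following is only a sketch of one way to get that behaviour from an `asyncio.Queue` subclass. `DroppingBuffer`, `_demo` and `demo` are hypothetical names, not the library's implementation.

```python
import asyncio


class DroppingBuffer(asyncio.Queue):
    """FIFO queue that discards its oldest item when full (illustrative only)."""

    def put_nowait(self, item):
        if self.full():
            # Make room by dropping the oldest entry.
            super().get_nowait()
        super().put_nowait(item)


async def _demo():
    buf = DroppingBuffer(maxsize=2)
    for block in ("a", "b", "c"):
        buf.put_nowait(block)
    # Drain whatever is left, oldest first.
    return [buf.get_nowait() for _ in range(buf.qsize())]


def demo():
    return asyncio.run(_demo())
```

With `maxsize=0` the queue never reports itself as full, so nothing is dropped, which matches the "0 means unlimited" convention of the standard queue classes.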
@@ -1,13 +1,13 @@
|
||||
import aiohttp
|
||||
|
||||
import asyncio
|
||||
import httpx
|
||||
import json
|
||||
import signal
|
||||
import time
|
||||
import traceback
|
||||
import urllib.parse
|
||||
from aiohttp.client_exceptions import ClientConnectorError
|
||||
from concurrent.futures import CancelledError
|
||||
from asyncio import Queue
|
||||
from concurrent.futures import CancelledError
|
||||
from .buffer import Buffer
|
||||
from ..parser.live import Parser
|
||||
from .. import config
|
||||
@@ -22,54 +22,51 @@ MAX_RETRY = 10
|
||||
|
||||
|
||||
class LiveChatAsync:
|
||||
'''asyncio(aiohttp)を利用してYouTubeのライブ配信のチャットデータを取得する。
|
||||
'''LiveChatAsync object fetches chat data and stores them
|
||||
in a buffer with asyncio.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
video_id : str
|
||||
ID of the video.
|
||||
|
||||
seektime : int
|
||||
(ライブチャット取得時は無視)
|
||||
取得開始するアーカイブ済みチャットの経過時間(秒)
|
||||
マイナス値を指定した場合は、配信開始前のチャットも取得する。
|
||||
start position of fetching chat (seconds).
|
||||
This option is valid for archived chat only.
|
||||
If negative value, chat data posted before the start of the broadcast
|
||||
will be retrieved as well.
|
||||
|
||||
processor : ChatProcessor
|
||||
Object that processes chat data.
|
||||
|
||||
buffer : Buffer(maxsize:20[default])
|
||||
チャットデータchat_componentを格納するバッファ。
|
||||
maxsize : 格納できるchat_componentの個数
|
||||
default値20個。1個で約5~10秒分。
|
||||
buffer : Buffer
|
||||
Buffer of chat data fetched in the background.
|
||||
|
||||
interruptable : bool
|
||||
Ctrl+Cによる処理中断を行うかどうか。
|
||||
Allows keyboard interrupts.
|
||||
Set this parameter to False if your own threading program causes
|
||||
the problem.
|
||||
|
||||
callback : func
|
||||
_listen()関数から一定間隔で自動的に呼びだす関数。
|
||||
function called periodically from _listen().
|
||||
|
||||
done_callback : func
|
||||
listener終了時に呼び出すコールバック。
|
||||
function called when listener ends.
|
||||
|
||||
exception_handler : func
|
||||
Function that handles exceptions.
|
||||
|
||||
direct_mode : bool
|
||||
Trueの場合、bufferを使わずにcallbackを呼ぶ。
|
||||
Trueの場合、callbackの設定が必須
|
||||
(設定していない場合IllegalFunctionCall例外を発生させる)
|
||||
If True, invoke specified callback function without using buffer.
|
||||
callback is required. If not, IllegalFunctionCall will be raised.
|
||||
|
||||
force_replay : bool
|
||||
Trueの場合、ライブチャットが取得できる場合であっても
|
||||
強制的にアーカイブ済みチャットを取得する。
|
||||
Force fetching of archived chat data, even if the specified video is live.
|
||||
|
||||
topchat_only : bool
|
||||
Trueの場合、上位チャットのみ取得する。
|
||||
If True, get only top chat.
|
||||
|
||||
Attributes
|
||||
---------
|
||||
_is_alive : bool
|
||||
チャット取得を停止するためのフラグ
|
||||
Flag to stop getting chat.
|
||||
'''
|
||||
|
||||
_setup_finished = False
|
||||
@@ -114,31 +111,30 @@ class LiveChatAsync:
|
||||
self._set_exception_handler(exception_handler)
|
||||
if interruptable:
|
||||
signal.signal(signal.SIGINT,
|
||||
(lambda a, b: asyncio.create_task(
|
||||
LiveChatAsync.shutdown(None, signal.SIGINT, b))))
|
||||
(lambda a, b: self._keyboard_interrupt()))
|
||||
self._setup()
|
||||
|
||||
def _setup(self):
|
||||
# direct modeがTrueでcallback未設定の場合例外発生。
|
||||
# An exception is raised when direct mode is true and no callback is set.
|
||||
if self._direct_mode:
|
||||
if self._callback is None:
|
||||
raise exceptions.IllegalFunctionCall(
|
||||
"When direct_mode=True, callback parameter is required.")
|
||||
else:
|
||||
# direct modeがFalseでbufferが未設定ならばデフォルトのbufferを作成
|
||||
# Create a default buffer if `direct_mode` is False and buffer is not set.
|
||||
if self._buffer is None:
|
||||
self._buffer = Buffer(maxsize=20)
|
||||
# callbackが指定されている場合はcallbackを呼ぶループタスクを作成
|
||||
# Create a loop task to call callback if the `callback` param is specified.
|
||||
if self._callback is None:
|
||||
pass
|
||||
else:
|
||||
# callbackを呼ぶループタスクの開始
|
||||
# Start a loop task calling callback function.
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.create_task(self._callback_loop(self._callback))
|
||||
# _listenループタスクの開始
|
||||
# Start a loop task for _listen()
|
||||
loop = asyncio.get_event_loop()
|
||||
self.listen_task = loop.create_task(self._startlisten())
|
||||
# add_done_callbackの登録
|
||||
# Register add_done_callback
|
||||
if self._done_callback is None:
|
||||
self.listen_task.add_done_callback(self._finish)
|
||||
else:
|
||||
@@ -161,11 +157,11 @@ class LiveChatAsync:
|
||||
parameter for next chat data
|
||||
'''
|
||||
try:
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with httpx.AsyncClient(http2=True) as client:
|
||||
while(continuation and self._is_alive):
|
||||
continuation = await self._check_pause(continuation)
|
||||
contents = await self._get_contents(
|
||||
continuation, session, headers)
|
||||
continuation, client, headers)
|
||||
metadata, chatdata = self._parser.parse(contents)
|
||||
|
||||
timeout = metadata['timeoutMs'] / 1000
|
||||
@@ -190,12 +186,12 @@ class LiveChatAsync:
|
||||
except exceptions.ChatParseException as e:
|
||||
self._logger.debug(f"[{self._video_id}]{str(e)}")
|
||||
raise
|
||||
except (TypeError, json.JSONDecodeError):
|
||||
except Exception:
|
||||
self._logger.error(f"{traceback.format_exc(limit = -1)}")
|
||||
raise
|
||||
|
||||
self._logger.debug(f"[{self._video_id}]finished fetching chat.")
|
||||
raise exceptions.ChatDataFinished
|
||||
self._logger.debug(f"[{self._video_id}] finished fetching chat.")
|
||||
|
||||
|
||||
async def _check_pause(self, continuation):
|
||||
if self._pauser.empty():
|
||||
@@ -210,7 +206,7 @@ class LiveChatAsync:
|
||||
self._video_id, 3, self._topchat_only)
|
||||
return continuation
|
||||
|
||||
async def _get_contents(self, continuation, session, headers):
|
||||
async def _get_contents(self, continuation, client, headers):
|
||||
'''Get 'continuationContents' from livechat json.
|
||||
If contents is None at first fetching,
|
||||
try to fetch archive chat data.
|
||||
@@ -219,7 +215,7 @@ class LiveChatAsync:
|
||||
-------
|
||||
'continuationContents' which includes metadata & chatdata.
|
||||
'''
|
||||
livechat_json = await self._get_livechat_json(continuation, session, headers)
|
||||
livechat_json = await self._get_livechat_json(continuation, client, headers)
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
if self._first_fetch:
|
||||
if contents is None or self._is_replay:
|
||||
@@ -229,48 +225,47 @@ class LiveChatAsync:
|
||||
continuation = arcparam.getparam(
|
||||
self._video_id, self.seektime, self._topchat_only)
|
||||
livechat_json = (await self._get_livechat_json(
|
||||
continuation, session, headers))
|
||||
continuation, client, headers))
|
||||
reload_continuation = self._parser.reload_continuation(
|
||||
self._parser.get_contents(livechat_json))
|
||||
if reload_continuation:
|
||||
livechat_json = (await self._get_livechat_json(
|
||||
reload_continuation, session, headers))
|
||||
reload_continuation, client, headers))
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
self._is_replay = True
|
||||
self._first_fetch = False
|
||||
return contents
|
||||
|
||||
async def _get_livechat_json(self, continuation, session, headers):
|
||||
async def _get_livechat_json(self, continuation, client, headers):
|
||||
'''
|
||||
Get json which includes chat data.
|
||||
'''
|
||||
continuation = urllib.parse.quote(continuation)
|
||||
livechat_json = None
|
||||
status_code = 0
|
||||
url = f"https://www.youtube.com/{self._fetch_url}{continuation}&pbj=1"
|
||||
for _ in range(MAX_RETRY + 1):
|
||||
async with session.get(url, headers=headers) as resp:
|
||||
try:
|
||||
text = await resp.text()
|
||||
livechat_json = json.loads(text)
|
||||
break
|
||||
except (ClientConnectorError, json.JSONDecodeError):
|
||||
await asyncio.sleep(1)
|
||||
continue
|
||||
try:
|
||||
resp = await client.get(url, headers=headers)
|
||||
livechat_json = resp.json()
|
||||
break
|
||||
except (json.JSONDecodeError, httpx.HTTPError):
|
||||
await asyncio.sleep(1)
|
||||
continue
|
||||
else:
|
||||
self._logger.error(f"[{self._video_id}]"
|
||||
f"Exceeded retry count. status_code={status_code}")
|
||||
f"Exceeded retry count.")
|
||||
return None
|
||||
return livechat_json
|
||||
|
||||
async def _callback_loop(self, callback):
|
||||
""" コンストラクタでcallbackを指定している場合、バックグラウンドで
|
||||
callbackに指定された関数に一定間隔でチャットデータを投げる。
|
||||
""" If a callback is specified in the constructor,
|
||||
it throws chat data at regular intervals to the
|
||||
function specified in the callback in the background.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
callback : func
|
||||
加工済みのチャットデータを渡す先の関数。
|
||||
function to which the processed chat data is passed.
|
||||
"""
|
||||
while self.is_alive():
|
||||
items = await self._buffer.get()
|
||||
@@ -281,11 +276,13 @@ class LiveChatAsync:
|
||||
await self._callback(processed_chat)
|
||||
|
||||
async def get(self):
|
||||
""" bufferからデータを取り出し、processorに投げ、
|
||||
加工済みのチャットデータを返す。
|
||||
"""
|
||||
Retrieves data from the buffer,
|
||||
throws it to the processor,
|
||||
and returns the processed chat data.
|
||||
|
||||
Returns
|
||||
: Processorによって加工されたチャットデータ
|
||||
: Chat data processed by the Processor
|
||||
"""
|
||||
if self._callback is None:
|
||||
if self.is_alive():
|
||||
@@ -294,7 +291,7 @@ class LiveChatAsync:
|
||||
else:
|
||||
return []
|
||||
raise exceptions.IllegalFunctionCall(
|
||||
"既にcallbackを登録済みのため、get()は実行できません。")
|
||||
"Callback parameter is already set, so get() cannot be performed.")
|
||||
|
||||
def is_replay(self):
|
||||
return self._is_replay
|
||||
@@ -315,11 +312,11 @@ class LiveChatAsync:
|
||||
return self._is_alive
|
||||
|
||||
def _finish(self, sender):
|
||||
'''Listener終了時のコールバック'''
|
||||
'''Called when the _listen() task finished.'''
|
||||
try:
|
||||
self._task_finished()
|
||||
except CancelledError:
|
||||
self._logger.debug(f'[{self._video_id}]cancelled:{sender}')
|
||||
self._logger.debug(f'[{self._video_id}] cancelled:{sender}')
|
||||
|
||||
def terminate(self):
|
||||
if self._pauser.empty():
|
||||
@@ -327,10 +324,14 @@ class LiveChatAsync:
|
||||
self._is_alive = False
|
||||
self._buffer.put_nowait({})
|
||||
self.processor.finalize()
|
||||
|
||||
|
||||
def _keyboard_interrupt(self):
|
||||
self.exception = exceptions.ChatDataFinished()
|
||||
self.terminate()
|
||||
|
||||
def _task_finished(self):
|
||||
'''
|
||||
Listenerを終了する。
|
||||
Terminate fetching chats.
|
||||
'''
|
||||
if self.is_alive():
|
||||
self.terminate()
|
||||
@@ -340,7 +341,7 @@ class LiveChatAsync:
|
||||
self.exception = e
|
||||
if not isinstance(e, exceptions.ChatParseException):
|
||||
self._logger.error(f'Internal exception - {type(e)}{str(e)}')
|
||||
self._logger.info(f'[{self._video_id}]終了しました')
|
||||
self._logger.info(f'[{self._video_id}] finished.')
|
||||
|
||||
def raise_for_status(self):
|
||||
if self.exception is not None:
|
||||
@@ -350,15 +351,3 @@ class LiveChatAsync:
|
||||
def _set_exception_handler(cls, handler):
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.set_exception_handler(handler)
|
||||
|
||||
@classmethod
|
||||
async def shutdown(cls, event, sig=None, handler=None):
|
||||
cls._logger.debug("shutdown...")
|
||||
tasks = [t for t in asyncio.all_tasks() if t is not
|
||||
asyncio.current_task()]
|
||||
[task.cancel() for task in tasks]
|
||||
|
||||
cls._logger.debug("complete remaining tasks...")
|
||||
await asyncio.gather(*tasks, return_exceptions=True)
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.stop()
|
||||
|
||||
@@ -4,13 +4,13 @@ import queue
|
||||
|
||||
class Buffer(queue.Queue):
|
||||
'''
|
||||
チャットデータを格納するバッファの役割を持つFIFOキュー
|
||||
Buffer for storing chat data.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
max_size : int
|
||||
格納するチャットブロックの最大個数。0の場合は無限。
|
||||
最大値を超える場合は古いチャットブロックから破棄される。
|
||||
maxsize : int
|
||||
Maximum number of chat blocks to be stored.
|
||||
If it exceeds the maximum, the oldest chat block will be discarded.
|
||||
'''
|
||||
|
||||
def __init__(self, maxsize=0):
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import requests
|
||||
import httpx
|
||||
import json
|
||||
import signal
|
||||
import time
|
||||
@@ -21,54 +21,53 @@ MAX_RETRY = 10
|
||||
|
||||
|
||||
class LiveChat:
|
||||
''' スレッドプールを利用してYouTubeのライブ配信のチャットデータを取得する
|
||||
'''
|
||||
LiveChat object fetches chat data and stores them
|
||||
in a buffer with ThreadPoolExecutor.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
video_id : str
|
||||
ID of the video.
|
||||
|
||||
seektime : int
|
||||
(ライブチャット取得時は無視)
|
||||
取得開始するアーカイブ済みチャットの経過時間(秒)
|
||||
マイナス値を指定した場合は、配信開始前のチャットも取得する。
|
||||
start position of fetching chat (seconds).
|
||||
This option is valid for archived chat only.
|
||||
If negative value, chat data posted before the start of the broadcast
|
||||
will be retrieved as well.
|
||||
|
||||
processor : ChatProcessor
|
||||
Object that processes chat data.
|
||||
|
||||
buffer : Buffer(maxsize:20[default])
|
||||
チャットデータchat_componentを格納するバッファ。
|
||||
maxsize : 格納できるchat_componentの個数
|
||||
default値20個。1個で約5~10秒分。
|
||||
buffer : Buffer
|
||||
Buffer of chat data fetched in the background.
|
||||
|
||||
interruptable : bool
|
||||
Ctrl+Cによる処理中断を行うかどうか。
|
||||
Allows keyboard interrupts.
|
||||
Set this parameter to False if your own threading program causes
|
||||
the problem.
|
||||
|
||||
callback : func
|
||||
_listen()関数から一定間隔で自動的に呼びだす関数。
|
||||
function called periodically from _listen().
|
||||
|
||||
done_callback : func
|
||||
listener終了時に呼び出すコールバック。
|
||||
function called when listener ends.
|
||||
|
||||
direct_mode : bool
|
||||
Trueの場合、bufferを使わずにcallbackを呼ぶ。
|
||||
Trueの場合、callbackの設定が必須
|
||||
(設定していない場合IllegalFunctionCall例外を発生させる)
|
||||
If True, invoke specified callback function without using buffer.
|
||||
callback is required. If not, IllegalFunctionCall will be raised.
|
||||
|
||||
force_replay : bool
|
||||
Trueの場合、ライブチャットが取得できる場合であっても
|
||||
強制的にアーカイブ済みチャットを取得する。
|
||||
Force fetching of archived chat data, even if the specified video is live.
|
||||
|
||||
topchat_only : bool
|
||||
Trueの場合、上位チャットのみ取得する。
|
||||
If True, get only top chat.
|
||||
|
||||
Attributes
|
||||
---------
|
||||
_executor : ThreadPoolExecutor
|
||||
チャットデータ取得ループ(_listen)用のスレッド
|
||||
This is used for _listen() loop.
|
||||
|
||||
_is_alive : bool
|
||||
チャット取得を停止するためのフラグ
|
||||
Flag to stop getting chat.
|
||||
'''
|
||||
|
||||
_setup_finished = False
|
||||
@@ -112,24 +111,24 @@ class LiveChat:
|
||||
self._setup()
|
||||
|
||||
def _setup(self):
|
||||
# direct modeがTrueでcallback未設定の場合例外発生。
|
||||
# An exception is raised when direct mode is true and no callback is set.
|
||||
if self._direct_mode:
|
||||
if self._callback is None:
|
||||
raise exceptions.IllegalFunctionCall(
|
||||
"When direct_mode=True, callback parameter is required.")
|
||||
else:
|
||||
# direct modeがFalseでbufferが未設定ならばデフォルトのbufferを作成
|
||||
# Create a default buffer if `direct_mode` is False and buffer is not set.
|
||||
if self._buffer is None:
|
||||
self._buffer = Buffer(maxsize=20)
|
||||
# callbackが指定されている場合はcallbackを呼ぶループタスクを作成
|
||||
# Create a loop task to call callback if the `callback` param is specified.
|
||||
if self._callback is None:
|
||||
pass
|
||||
else:
|
||||
# callbackを呼ぶループタスクの開始
|
||||
# Start a loop task calling callback function.
|
||||
self._executor.submit(self._callback_loop, self._callback)
|
||||
# _listenループタスクの開始
|
||||
# Start a loop task for _listen()
|
||||
self.listen_task = self._executor.submit(self._startlisten)
|
||||
# add_done_callbackの登録
|
||||
# Register add_done_callback
|
||||
if self._done_callback is None:
|
||||
self.listen_task.add_done_callback(self._finish)
|
||||
else:
|
||||
@@ -153,10 +152,10 @@ class LiveChat:
|
||||
parameter for next chat data
|
||||
'''
|
||||
try:
|
||||
with requests.Session() as session:
|
||||
with httpx.Client(http2=True) as client:
|
||||
while(continuation and self._is_alive):
|
||||
continuation = self._check_pause(continuation)
|
||||
contents = self._get_contents(continuation, session, headers)
|
||||
contents = self._get_contents(continuation, client, headers)
|
||||
metadata, chatdata = self._parser.parse(contents)
|
||||
timeout = metadata['timeoutMs'] / 1000
|
||||
chat_component = {
|
||||
@@ -180,12 +179,12 @@ class LiveChat:
|
||||
except exceptions.ChatParseException as e:
|
||||
self._logger.debug(f"[{self._video_id}]{str(e)}")
|
||||
raise
|
||||
except (TypeError, json.JSONDecodeError):
|
||||
except Exception:
|
||||
self._logger.error(f"{traceback.format_exc(limit=-1)}")
|
||||
raise
|
||||
|
||||
self._logger.debug(f"[{self._video_id}]finished fetching chat.")
|
||||
raise exceptions.ChatDataFinished
|
||||
self._logger.debug(f"[{self._video_id}] finished fetching chat.")
|
||||
|
||||
|
||||
def _check_pause(self, continuation):
|
||||
if self._pauser.empty():
|
||||
@@ -199,7 +198,7 @@ class LiveChat:
|
||||
continuation = liveparam.getparam(self._video_id, 3)
|
||||
return continuation
|
||||
|
||||
def _get_contents(self, continuation, session, headers):
|
||||
def _get_contents(self, continuation, client, headers):
|
||||
'''Get 'continuationContents' from livechat json.
|
||||
If contents is None at first fetching,
|
||||
try to fetch archive chat data.
|
||||
@@ -209,7 +208,7 @@ class LiveChat:
|
||||
'continuationContents' which includes metadata & chat data.
|
||||
'''
|
||||
livechat_json = (
|
||||
self._get_livechat_json(continuation, session, headers)
|
||||
self._get_livechat_json(continuation, client, headers)
|
||||
)
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
if self._first_fetch:
|
||||
@@ -219,48 +218,47 @@ class LiveChat:
|
||||
self._fetch_url = "live_chat_replay/get_live_chat_replay?continuation="
|
||||
continuation = arcparam.getparam(
|
||||
self._video_id, self.seektime, self._topchat_only)
|
||||
livechat_json = (self._get_livechat_json(continuation, session, headers))
|
||||
livechat_json = (self._get_livechat_json(continuation, client, headers))
|
||||
reload_continuation = self._parser.reload_continuation(
|
||||
self._parser.get_contents(livechat_json))
|
||||
if reload_continuation:
|
||||
livechat_json = (self._get_livechat_json(
|
||||
reload_continuation, session, headers))
|
||||
reload_continuation, client, headers))
|
||||
contents = self._parser.get_contents(livechat_json)
|
||||
self._is_replay = True
|
||||
self._first_fetch = False
|
||||
return contents
|
||||
|
||||
def _get_livechat_json(self, continuation, session, headers):
|
||||
def _get_livechat_json(self, continuation, client, headers):
|
||||
'''
|
||||
Get json which includes chat data.
|
||||
'''
|
||||
continuation = urllib.parse.quote(continuation)
|
||||
livechat_json = None
|
||||
status_code = 0
|
||||
url = f"https://www.youtube.com/{self._fetch_url}{continuation}&pbj=1"
|
||||
for _ in range(MAX_RETRY + 1):
|
||||
with session.get(url, headers=headers) as resp:
|
||||
with client:
|
||||
try:
|
||||
text = resp.text
|
||||
livechat_json = json.loads(text)
|
||||
livechat_json = client.get(url, headers=headers).json()
|
||||
break
|
||||
except json.JSONDecodeError:
|
||||
time.sleep(1)
|
||||
except (json.JSONDecodeError, httpx.HTTPError):
|
||||
time.sleep(2)
|
||||
continue
|
||||
else:
|
||||
self._logger.error(f"[{self._video_id}]"
|
||||
f"Exceeded retry count. status_code={status_code}")
|
||||
f"Exceeded retry count.")
|
||||
raise exceptions.RetryExceedMaxCount()
|
||||
return livechat_json
|
||||
|
||||
def _callback_loop(self, callback):
|
||||
""" コンストラクタでcallbackを指定している場合、バックグラウンドで
|
||||
callbackに指定された関数に一定間隔でチャットデータを投げる。
|
||||
""" If a callback is specified in the constructor,
|
||||
it throws chat data at regular intervals to the
|
||||
function specified in the callback in the background.
|
||||
|
||||
Parameter
|
||||
---------
|
||||
callback : func
|
||||
加工済みのチャットデータを渡す先の関数。
|
||||
function to which the processed chat data is passed.
|
||||
"""
|
||||
while self.is_alive():
|
||||
items = self._buffer.get()
|
||||
@@ -271,11 +269,13 @@ class LiveChat:
|
||||
self._callback(processed_chat)
|
||||
|
||||
def get(self):
|
||||
""" bufferからデータを取り出し、processorに投げ、
|
||||
加工済みのチャットデータを返す。
|
||||
"""
|
||||
Retrieves data from the buffer,
|
||||
throws it to the processor,
|
||||
and returns the processed chat data.
|
||||
|
||||
Returns
|
||||
: Processorによって加工されたチャットデータ
|
||||
: Chat data processed by the Processor
|
||||
"""
|
||||
if self._callback is None:
|
||||
if self.is_alive():
|
||||
@@ -284,7 +284,7 @@ class LiveChat:
|
||||
else:
|
||||
return []
|
||||
raise exceptions.IllegalFunctionCall(
|
||||
"既にcallbackを登録済みのため、get()は実行できません。")
|
||||
"Callback parameter is already set, so get() cannot be performed.")
|
||||
|
||||
def is_replay(self):
|
||||
return self._is_replay
|
||||
@@ -305,13 +305,16 @@ class LiveChat:
|
||||
return self._is_alive
|
||||
|
||||
def _finish(self, sender):
|
||||
'''Listener終了時のコールバック'''
|
||||
'''Called when the _listen() task finished.'''
|
||||
try:
|
||||
self._task_finished()
|
||||
except CancelledError:
|
||||
self._logger.debug(f'[{self._video_id}]cancelled:{sender}')
|
||||
self._logger.debug(f'[{self._video_id}] cancelled:{sender}')
|
||||
|
||||
def terminate(self):
|
||||
'''
|
||||
Terminate fetching chats.
|
||||
'''
|
||||
if self._pauser.empty():
|
||||
self._pauser.put_nowait(None)
|
||||
self._is_alive = False
|
||||
@@ -320,9 +323,6 @@ class LiveChat:
|
||||
self.processor.finalize()
|
||||
|
||||
def _task_finished(self):
|
||||
'''
|
||||
Terminate the listener.
|
||||
'''
|
||||
if self.is_alive():
|
||||
self.terminate()
|
||||
try:
|
||||
@@ -331,7 +331,7 @@ class LiveChat:
|
||||
self.exception = e
|
||||
if not isinstance(e, exceptions.ChatParseException):
|
||||
self._logger.error(f'Internal exception - {type(e)}{str(e)}')
|
||||
self._logger.info(f'[{self._video_id}]終了しました')
|
||||
self._logger.info(f'[{self._video_id}] finished.')
|
||||
|
||||
def raise_for_status(self):
|
||||
if self.exception is not None:
|
||||
|
||||
@@ -38,7 +38,9 @@ class InvalidVideoIdException(Exception):
|
||||
'''
|
||||
Thrown when the video_id does not exist (VideoInfo).
|
||||
'''
|
||||
pass
|
||||
def __init__(self, doc):
|
||||
self.msg = "InvalidVideoIdException"
|
||||
self.doc = doc
|
||||
|
||||
|
||||
class UnknownConnectionError(Exception):
|
||||
@@ -47,7 +49,7 @@ class UnknownConnectionError(Exception):
|
||||
|
||||
class RetryExceedMaxCount(Exception):
|
||||
'''
|
||||
thrown when the number of retries exceeds the maximum value.
|
||||
Thrown when the number of retries exceeds the maximum value.
|
||||
'''
|
||||
pass
|
||||
|
||||
@@ -64,7 +66,16 @@ class FailedExtractContinuation(ChatDataFinished):
|
||||
pass
|
||||
|
||||
|
||||
class VideoInfoParseException(Exception):
|
||||
class VideoInfoParseError(Exception):
|
||||
'''
|
||||
thrown when failed to parse video info
|
||||
Base exception when parsing video info.
|
||||
'''
|
||||
|
||||
|
||||
class PatternUnmatchError(VideoInfoParseError):
|
||||
'''
|
||||
Thrown when failed to parse video info with unmatched pattern.
|
||||
'''
|
||||
def __init__(self, doc=''):
|
||||
self.msg = "PatternUnmatchError"
|
||||
self.doc = doc
|
||||
|
||||
@@ -1,133 +0,0 @@
|
||||
from base64 import urlsafe_b64encode as b64enc
|
||||
from functools import reduce
|
||||
import urllib.parse
|
||||
|
||||
'''
|
||||
Generate continuation parameter of youtube replay chat.
|
||||
|
||||
Author: taizan-hokuto (2019) @taizan205
|
||||
|
||||
ver 0.0.1 2019.10.05
|
||||
'''
|
||||
|
||||
|
||||
def _gen_vid_long(video_id):
|
||||
"""generate video_id parameter.
|
||||
Parameter
|
||||
---------
|
||||
video_id : str
|
||||
|
||||
Return
|
||||
---------
|
||||
byte[] : base64 encoded video_id parameter.
|
||||
"""
|
||||
header_magic = b'\x0A\x0F\x1A\x0D\x0A'
|
||||
header_id = video_id.encode()
|
||||
header_sep_1 = b'\x1A\x13\xEA\xA8\xDD\xB9\x01\x0D\x0A\x0B'
|
||||
header_terminator = b'\x20\x01'
|
||||
|
||||
item = [
|
||||
header_magic,
|
||||
_nval(len(header_id)),
|
||||
header_id,
|
||||
header_sep_1,
|
||||
header_id,
|
||||
header_terminator
|
||||
]
|
||||
|
||||
return urllib.parse.quote(
|
||||
b64enc(reduce(lambda x, y: x + y, item)).decode()
|
||||
).encode()
|
||||
|
||||
|
||||
def _gen_vid(video_id):
|
||||
"""generate video_id parameter.
|
||||
Parameter
|
||||
---------
|
||||
video_id : str
|
||||
|
||||
Return
|
||||
---------
|
||||
bytes : base64 encoded video_id parameter.
|
||||
"""
|
||||
header_magic = b'\x0A\x0F\x1A\x0D\x0A'
|
||||
header_id = video_id.encode()
|
||||
header_terminator = b'\x20\x01'
|
||||
|
||||
item = [
|
||||
header_magic,
|
||||
_nval(len(header_id)),
|
||||
header_id,
|
||||
header_terminator
|
||||
]
|
||||
|
||||
return urllib.parse.quote(
|
||||
b64enc(reduce(lambda x, y: x + y, item)).decode()
|
||||
).encode()
|
||||
|
||||
|
||||
def _nval(val):
    """convert value to byte array"""
    if val < 0:
        raise ValueError
    buf = b''
    while val >> 7:
        m = val & 0xFF | 0x80
        buf += m.to_bytes(1, 'big')
        val >>= 7
    buf += val.to_bytes(1, 'big')
    return buf
|
||||
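`_nval` appears to produce a base-128 variable-length integer of the kind used by Protocol Buffers: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below restates that encoding and adds a decoder for round-trip checking. `encode_varint` and `decode_varint` are hypothetical names; the removed file contains no decoder.

```python
def encode_varint(value):
    """Encode a non-negative integer as a little-endian base-128 varint."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >> 7:
        # Low 7 bits plus the continuation flag (0x80).
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)  # final byte, continuation flag clear
    return bytes(out)


def decode_varint(data):
    """Decode the first varint found at the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return result
```

Masking with `0x7F` here and with `0xFF` in `_nval` gives the same bytes, because the subsequent `| 0x80` sets the eighth bit in either case.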
|
||||
|
||||
def _build(video_id, seektime, topchat_only):
|
||||
switch_01 = b'\x04' if topchat_only else b'\x01'
|
||||
if seektime < 0:
|
||||
raise ValueError("seektime must be greater than or equal to zero.")
|
||||
if seektime == 0:
|
||||
times = b''
|
||||
else:
|
||||
times = _nval(int(seektime * 1000))
|
||||
if seektime > 0:
|
||||
_len_time = b'\x5A' + (len(times) + 1).to_bytes(1, 'big') + b'\x10'
|
||||
else:
|
||||
_len_time = b''
|
||||
|
||||
header_magic = b'\xA2\x9D\xB0\xD3\x04'
|
||||
sep_0 = b'\x1A'
|
||||
vid = _gen_vid(video_id)
|
||||
_tag = b'\x40\x01'
|
||||
timestamp1 = times
|
||||
sep_1 = b'\x60\x04\x72\x02\x08'
|
||||
terminator = b'\x78\x01'
|
||||
|
||||
body = [
|
||||
sep_0,
|
||||
_nval(len(vid)),
|
||||
vid,
|
||||
_tag,
|
||||
_len_time,
|
||||
timestamp1,
|
||||
sep_1,
|
||||
switch_01,
|
||||
terminator
|
||||
]
|
||||
|
||||
body = reduce(lambda x, y: x + y, body)
|
||||
|
||||
return urllib.parse.quote(
|
||||
b64enc(header_magic + _nval(len(body)) + body
|
||||
).decode()
|
||||
)
|
||||
|
||||
|
||||
def getparam(video_id, seektime=0.0, topchat_only=False):
|
||||
'''
|
||||
Parameter
|
||||
---------
|
||||
seektime : int
|
||||
unit:seconds
|
||||
start position of fetching chat data.
|
||||
topchat_only : bool
|
||||
if True, fetch only 'top chat'
|
||||
'''
|
||||
return _build(video_id, seektime, topchat_only)
|
||||
@@ -8,15 +8,26 @@ from .. import exceptions
|
||||
|
||||
|
||||
class Parser:
|
||||
'''
|
||||
Parser of chat json.
|
||||
|
||||
Parameter
|
||||
----------
|
||||
is_replay : bool
|
||||
|
||||
__slots__ = ['is_replay']
|
||||
exception_holder : Object [default:None]
|
||||
The object holding exceptions.
|
||||
This is passed from the parent livechat object.
|
||||
'''
|
||||
__slots__ = ['is_replay', 'exception_holder']
|
||||
|
||||
def __init__(self, is_replay):
|
||||
def __init__(self, is_replay, exception_holder=None):
|
||||
self.is_replay = is_replay
|
||||
self.exception_holder = exception_holder
|
||||
|
||||
def get_contents(self, jsn):
|
||||
if jsn is None:
|
||||
raise exceptions.IllegalFunctionCall('Called with none JSON object.')
|
||||
self.raise_exception(exceptions.IllegalFunctionCall('Called with none JSON object.'))
|
||||
if jsn['response']['responseContext'].get('errors'):
|
||||
raise exceptions.ResponseContextError(
|
||||
'The video_id would be wrong, or video is deleted or private.')
|
||||
@@ -42,11 +53,11 @@ class Parser:
|
||||
|
||||
if contents is None:
|
||||
'''Broadcasting end or cannot fetch chat stream'''
|
||||
raise exceptions.NoContents('Chat data stream is empty.')
|
||||
self.raise_exception(exceptions.NoContents('Chat data stream is empty.'))
|
||||
|
||||
cont = contents['liveChatContinuation']['continuations'][0]
|
||||
if cont is None:
|
||||
raise exceptions.NoContinuation('No Continuation')
|
||||
self.raise_exception(exceptions.NoContinuation('No Continuation'))
|
||||
metadata = (cont.get('invalidationContinuationData')
|
||||
or cont.get('timedContinuationData')
|
||||
or cont.get('reloadContinuationData')
|
||||
@@ -54,13 +65,13 @@ class Parser:
|
||||
)
|
||||
if metadata is None:
|
||||
if cont.get("playerSeekContinuationData"):
|
||||
raise exceptions.ChatDataFinished('Finished chat data')
|
||||
self.raise_exception(exceptions.ChatDataFinished('Finished chat data'))
|
||||
unknown = list(cont.keys())[0]
|
||||
if unknown:
|
||||
raise exceptions.ReceivedUnknownContinuation(
|
||||
f"Received unknown continuation type:{unknown}")
|
||||
self.raise_exception(exceptions.ReceivedUnknownContinuation(
|
||||
f"Received unknown continuation type:{unknown}"))
|
||||
else:
|
||||
raise exceptions.FailedExtractContinuation('Cannot extract continuation data')
|
||||
self.raise_exception(exceptions.FailedExtractContinuation('Cannot extract continuation data'))
|
||||
return self._create_data(metadata, contents)
|
||||
|
||||
def reload_continuation(self, contents):
|
||||
@@ -72,7 +83,7 @@ class Parser:
|
||||
"""
|
||||
if contents is None:
|
||||
'''Broadcasting end or cannot fetch chat stream'''
|
||||
raise exceptions.NoContents('Chat data stream is empty.')
|
||||
self.raise_exception(exceptions.NoContents('Chat data stream is empty.'))
|
||||
cont = contents['liveChatContinuation']['continuations'][0]
|
||||
if cont.get("liveChatReplayContinuationData"):
|
||||
# chat data exist.
|
||||
@@ -81,7 +92,7 @@ class Parser:
|
||||
init_cont = cont.get("playerSeekContinuationData")
|
||||
if init_cont:
|
||||
return init_cont.get("continuation")
|
||||
raise exceptions.ChatDataFinished('Finished chat data')
|
||||
self.raise_exception(exceptions.ChatDataFinished('Finished chat data'))
|
||||
|
||||
def _create_data(self, metadata, contents):
|
||||
actions = contents['liveChatContinuation'].get('actions')
|
||||
@@ -103,3 +114,8 @@ class Parser:
|
||||
start = int(actions[0]["replayChatItemAction"]["videoOffsetTimeMsec"])
|
||||
last = int(actions[-1]["replayChatItemAction"]["videoOffsetTimeMsec"])
|
||||
return (last - start)
|
||||
|
||||
    def raise_exception(self, exception):
        if self.exception_holder is None:
            raise exception
        self.exception_holder = exception
|
||||
|
||||
@@ -36,3 +36,7 @@ class Combinator(ChatProcessor):
|
||||
'''
|
||||
return tuple(processor.process(chat_components)
|
||||
for processor in self.processors)
|
||||
|
||||
def finalize(self, *args, **kwargs):
|
||||
[processor.finalize(*args, **kwargs)
|
||||
for processor in self.processors]
|
||||
|
||||
11
pytchat/processors/default/custom_encoder.py
Normal file
@@ -0,0 +1,11 @@
import json
from .renderer.base import Author
from .renderer.paidmessage import Colors
from .renderer.paidsticker import Colors2


class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Author) or isinstance(obj, Colors) or isinstance(obj, Colors2):
            return vars(obj)
        return json.JSONEncoder.default(self, obj)
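The encoder added in this file can be illustrated with a minimal, self-contained sketch. It assumes a stand-in `Author` class and a hypothetical `VarsEncoder` name (neither is the package's real class), and only shows how `json.JSONEncoder.default` can fall back to `vars()` for plain attribute-holder objects.

```python
import json


class Author:
    """Stand-in for a plain attribute holder (hypothetical, for illustration)."""
    pass


class VarsEncoder(json.JSONEncoder):
    """Serialise known holder classes through their attribute dict."""

    def default(self, obj):
        if isinstance(obj, Author):
            return vars(obj)
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


class Chat:
    def json(self) -> str:
        # ensure_ascii=False keeps non-ASCII chat text readable in the output.
        return json.dumps(vars(self), ensure_ascii=False, cls=VarsEncoder)


chat = Chat()
chat.message = "こんにちは"
chat.author = Author()
chat.author.name = "alice"
encoded = chat.json()
```

Passing a tuple to `isinstance`, for example `isinstance(obj, (Author, Colors, Colors2))`, would be an equivalent and shorter form of the chained check in the diff.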
||||
@@ -1,5 +1,7 @@
|
||||
import asyncio
|
||||
import json
|
||||
import time
|
||||
from .custom_encoder import CustomEncoder
|
||||
from .renderer.textmessage import LiveChatTextMessageRenderer
|
||||
from .renderer.paidmessage import LiveChatPaidMessageRenderer
|
||||
from .renderer.paidsticker import LiveChatPaidStickerRenderer
|
||||
@@ -11,25 +13,120 @@ from ... import config
|
||||
logger = config.logger(__name__)
|
||||
|
||||
|
||||
class Chat:
|
||||
def json(self) -> str:
|
||||
return json.dumps(vars(self), ensure_ascii=False, cls=CustomEncoder)
|
||||
|
||||
|
||||
class Chatdata:
|
||||
def __init__(self, chatlist: list, timeout: float):
|
||||
|
||||
def __init__(self, chatlist: list, timeout: float, abs_diff):
|
||||
self.items = chatlist
|
||||
self.interval = timeout
|
||||
self.abs_diff = abs_diff
|
||||
self.itemcount = 0
|
||||
|
||||
def tick(self):
|
||||
if self.interval == 0:
|
||||
'''DEPRECATED
|
||||
Use sync_items()
|
||||
'''
|
||||
if len(self.items) < 1:
|
||||
time.sleep(1)
|
||||
return
|
||||
time.sleep(self.interval / len(self.items))
|
||||
if self.itemcount == 0:
|
||||
self.starttime = time.time()
|
||||
if len(self.items) == 1:
|
||||
total_itemcount = 1
|
||||
else:
|
||||
total_itemcount = len(self.items) - 1
|
||||
next_chattime = (self.items[0].timestamp + (self.items[-1].timestamp - self.items[0].timestamp) / total_itemcount * self.itemcount) / 1000
|
||||
tobe_disptime = self.abs_diff + next_chattime
|
||||
wait_sec = tobe_disptime - time.time()
|
||||
self.itemcount += 1
|
||||
|
||||
if wait_sec < 0:
|
||||
wait_sec = 0
|
||||
|
||||
time.sleep(wait_sec)
|
||||
|
||||
async def tick_async(self):
|
||||
if self.interval == 0:
|
||||
'''DEPRECATED
|
||||
Use async_items()
|
||||
'''
|
||||
if len(self.items) < 1:
|
||||
await asyncio.sleep(1)
|
||||
return
|
||||
await asyncio.sleep(self.interval / len(self.items))
|
||||
if self.itemcount == 0:
|
||||
self.starttime = time.time()
|
||||
if len(self.items) == 1:
|
||||
total_itemcount = 1
|
||||
else:
|
||||
total_itemcount = len(self.items) - 1
|
||||
next_chattime = (self.items[0].timestamp + (self.items[-1].timestamp - self.items[0].timestamp) / total_itemcount * self.itemcount) / 1000
|
||||
tobe_disptime = self.abs_diff + next_chattime
|
||||
wait_sec = tobe_disptime - time.time()
|
||||
self.itemcount += 1
|
||||
|
||||
if wait_sec < 0:
|
||||
wait_sec = 0
|
||||
|
||||
await asyncio.sleep(wait_sec)
|
||||
|
||||
def sync_items(self):
|
||||
starttime = time.time()
|
||||
if len(self.items) > 0:
|
||||
last_chattime = self.items[-1].timestamp / 1000
|
||||
tobe_disptime = self.abs_diff + last_chattime
|
||||
wait_total_sec = max(tobe_disptime - time.time(), 0)
|
||||
if len(self.items) > 1:
|
||||
wait_sec = wait_total_sec / len(self.items)
|
||||
elif len(self.items) == 1:
|
||||
wait_sec = 0
|
||||
for c in self.items:
|
||||
if wait_sec < 0:
|
||||
wait_sec = 0
|
||||
time.sleep(wait_sec)
|
||||
yield c
|
||||
stop_interval = time.time() - starttime
|
||||
if stop_interval < 1:
|
||||
time.sleep(1 - stop_interval)
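The pacing arithmetic in `sync_items()` can be sketched as a pure function. The helper name `per_item_wait` and the explicit `now` parameter are assumptions made so the calculation can be checked without sleeping; the formula itself follows the lines above.

```python
def per_item_wait(timestamps_ms, abs_diff, now):
    """Return the delay (seconds) to sleep before yielding each item.

    timestamps_ms: chat timestamps in milliseconds, oldest first.
    abs_diff:      wall-clock time minus the first chat's timestamp (seconds),
                   captured when the first batch was processed.
    now:           current wall-clock time in seconds.
    """
    if not timestamps_ms:
        return 0.0
    last_chattime = timestamps_ms[-1] / 1000
    # Wall-clock moment at which the last item of this batch is due.
    tobe_disptime = abs_diff + last_chattime
    wait_total_sec = max(tobe_disptime - now, 0)
    if len(timestamps_ms) > 1:
        # Spread the remaining time evenly over the batch.
        return wait_total_sec / len(timestamps_ms)
    return 0.0


# Four items, the last one due 2 s from "now": 0.5 s between items.
example_wait = per_item_wait([1000, 2000, 3000, 4000], abs_diff=100.0, now=102.0)
```

When the consumer is already behind schedule, the clamp to zero means the batch is emitted without delay rather than with a negative sleep.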
|
||||
|
||||
async def async_items(self):
|
||||
starttime = time.time()
|
||||
if len(self.items) > 0:
|
||||
last_chattime = self.items[-1].timestamp / 1000
|
||||
tobe_disptime = self.abs_diff + last_chattime
|
||||
wait_total_sec = max(tobe_disptime - time.time(), 0)
|
||||
if len(self.items) > 1:
|
||||
wait_sec = wait_total_sec / len(self.items)
|
||||
elif len(self.items) == 1:
|
||||
wait_sec = 0
|
||||
for c in self.items:
|
||||
if wait_sec < 0:
|
||||
wait_sec = 0
|
||||
await asyncio.sleep(wait_sec)
|
||||
yield c
|
||||
|
||||
stop_interval = time.time() - starttime
|
||||
if stop_interval < 1:
|
||||
await asyncio.sleep(1 - stop_interval)
|
||||
|
||||
def json(self) -> str:
|
||||
return json.dumps([vars(a) for a in self.items], ensure_ascii=False, cls=CustomEncoder)
|
||||
|
||||
|
||||
class DefaultProcessor(ChatProcessor):
|
||||
def __init__(self):
|
||||
self.first = True
|
||||
self.abs_diff = 0
|
||||
self.renderers = {
|
||||
"liveChatTextMessageRenderer": LiveChatTextMessageRenderer(),
|
||||
"liveChatPaidMessageRenderer": LiveChatPaidMessageRenderer(),
|
||||
"liveChatPaidStickerRenderer": LiveChatPaidStickerRenderer(),
|
||||
"liveChatLegacyPaidMessageRenderer": LiveChatLegacyPaidMessageRenderer(),
|
||||
"liveChatMembershipItemRenderer": LiveChatMembershipItemRenderer()
|
||||
}
|
||||
|
||||
def process(self, chat_components: list):
|
||||
|
||||
chatlist = []
|
||||
@@ -37,8 +134,10 @@ class DefaultProcessor(ChatProcessor):
|
||||
|
||||
if chat_components:
|
||||
for component in chat_components:
|
||||
if component is None:
|
||||
continue
|
||||
timeout += component.get('timeout', 0)
|
||||
chatdata = component.get('chatdata')
|
||||
chatdata = component.get('chatdata') # if from Extractor, chatdata is generator.
|
||||
if chatdata is None:
|
||||
continue
|
||||
for action in chatdata:
|
||||
@@ -46,43 +145,35 @@ class DefaultProcessor(ChatProcessor):
|
||||
continue
|
||||
if action.get('addChatItemAction') is None:
|
||||
continue
|
||||
if action['addChatItemAction'].get('item') is None:
|
||||
item = action['addChatItemAction'].get('item')
|
||||
if item is None:
|
||||
continue
|
||||
|
||||
chat = self._parse(action)
|
||||
chat = self._parse(item)
|
||||
if chat:
|
||||
chatlist.append(chat)
|
||||
return Chatdata(chatlist, float(timeout))
|
||||
|
||||
if self.first and chatlist:
|
||||
self.abs_diff = time.time() - chatlist[0].timestamp / 1000
|
||||
self.first = False
|
||||
|
||||
def _parse(self, sitem):
|
||||
action = sitem.get("addChatItemAction")
|
||||
if action:
|
||||
item = action.get("item")
|
||||
if item is None:
|
||||
return None
|
||||
chatdata = Chatdata(chatlist, float(timeout), self.abs_diff)
|
||||
|
||||
return chatdata
|
||||
|
||||
def _parse(self, item):
|
||||
try:
|
||||
renderer = self._get_renderer(item)
|
||||
key = list(item.keys())[0]
|
||||
renderer = self.renderers.get(key)
|
||||
if renderer is None:
|
||||
return None
|
||||
|
||||
renderer.setitem(item.get(key), Chat())
|
||||
renderer.settype()
|
||||
renderer.get_snippet()
|
||||
renderer.get_authordetails()
|
||||
rendered_chatobj = renderer.get_chatobj()
|
||||
renderer.clear()
|
||||
except (KeyError, TypeError) as e:
|
||||
logger.error(f"{str(type(e))}-{str(e)} sitem:{str(sitem)}")
|
||||
logger.error(f"{str(type(e))}-{str(e)} item:{str(item)}")
|
||||
return None
|
||||
return renderer
|
||||
|
||||
def _get_renderer(self, item):
|
||||
if item.get("liveChatTextMessageRenderer"):
|
||||
renderer = LiveChatTextMessageRenderer(item)
|
||||
elif item.get("liveChatPaidMessageRenderer"):
|
||||
renderer = LiveChatPaidMessageRenderer(item)
|
||||
elif item.get("liveChatPaidStickerRenderer"):
|
||||
renderer = LiveChatPaidStickerRenderer(item)
|
||||
elif item.get("liveChatLegacyPaidMessageRenderer"):
|
||||
renderer = LiveChatLegacyPaidMessageRenderer(item)
|
||||
elif item.get("liveChatMembershipItemRenderer"):
|
||||
renderer = LiveChatMembershipItemRenderer(item)
|
||||
else:
|
||||
renderer = None
|
||||
return renderer
|
||||
|
||||
return rendered_chatobj
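The change from an `if`/`elif` chain to a lookup in `self.renderers` is a dictionary dispatch on the item's first key. A minimal sketch of that idea follows; the two renderer classes and their `render` method are hypothetical stand-ins, not the package's renderer interface.

```python
class TextRenderer:
    def render(self, payload):
        return {"type": "textMessage", "id": payload.get("id")}


class PaidRenderer:
    def render(self, payload):
        return {"type": "superChat", "id": payload.get("id")}


# One reusable instance per renderer key, built once.
RENDERERS = {
    "liveChatTextMessageRenderer": TextRenderer(),
    "liveChatPaidMessageRenderer": PaidRenderer(),
}


def parse(item):
    """Dispatch on the first key of `item`; unknown keys yield None."""
    key = next(iter(item), None)
    renderer = RENDERERS.get(key)
    if renderer is None:
        return None
    return renderer.render(item[key])
```

Adding support for a new item type then means adding one dictionary entry instead of another `elif` branch.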
|
||||
|
||||
@@ -6,89 +6,96 @@ class Author:
|
||||
|
||||
|
||||
class BaseRenderer:
|
||||
def __init__(self, item, chattype):
|
||||
self.renderer = list(item.values())[0]
|
||||
self.chattype = chattype
|
||||
self.author = Author()
|
||||
def setitem(self, item, chat):
|
||||
self.item = item
|
||||
self.chat = chat
|
||||
self.chat.author = Author()
|
||||
|
||||
def settype(self):
|
||||
pass
|
||||
|
||||
def get_snippet(self):
|
||||
self.type = self.chattype
|
||||
self.id = self.renderer.get('id')
|
||||
timestampUsec = int(self.renderer.get("timestampUsec", 0))
|
||||
self.timestamp = int(timestampUsec / 1000)
|
||||
tst = self.renderer.get("timestampText")
|
||||
self.chat.id = self.item.get('id')
|
||||
timestampUsec = int(self.item.get("timestampUsec", 0))
|
||||
self.chat.timestamp = int(timestampUsec / 1000)
|
||||
tst = self.item.get("timestampText")
|
||||
if tst:
|
||||
self.elapsedTime = tst.get("simpleText")
|
||||
self.chat.elapsedTime = tst.get("simpleText")
|
||||
else:
|
||||
self.elapsedTime = ""
|
||||
self.datetime = self.get_datetime(timestampUsec)
|
||||
self.message, self.messageEx = self.get_message(self.renderer)
|
||||
self.id = self.renderer.get('id')
|
||||
self.amountValue = 0.0
|
||||
self.amountString = ""
|
||||
self.currency = ""
|
||||
self.bgColor = 0
|
||||
self.chat.elapsedTime = ""
|
||||
self.chat.datetime = self.get_datetime(timestampUsec)
|
||||
self.chat.message, self.chat.messageEx = self.get_message(self.item)
|
||||
self.chat.id = self.item.get('id')
|
||||
self.chat.amountValue = 0.0
|
||||
self.chat.amountString = ""
|
||||
self.chat.currency = ""
|
||||
self.chat.bgColor = 0
|
||||
|
||||
def get_authordetails(self):
|
||||
self.author.badgeUrl = ""
|
||||
(self.author.isVerified,
|
||||
self.author.isChatOwner,
|
||||
self.author.isChatSponsor,
|
||||
self.author.isChatModerator) = (
|
||||
self.get_badges(self.renderer)
|
||||
self.chat.author.badgeUrl = ""
|
||||
(self.chat.author.isVerified,
|
||||
self.chat.author.isChatOwner,
|
||||
self.chat.author.isChatSponsor,
|
||||
self.chat.author.isChatModerator) = (
|
||||
self.get_badges(self.item)
|
||||
)
|
||||
self.author.channelId = self.renderer.get("authorExternalChannelId")
|
||||
self.author.channelUrl = "http://www.youtube.com/channel/" + self.author.channelId
|
||||
self.author.name = self.renderer["authorName"]["simpleText"]
|
||||
self.author.imageUrl = self.renderer["authorPhoto"]["thumbnails"][1]["url"]
|
||||
self.chat.author.channelId = self.item.get("authorExternalChannelId")
|
||||
self.chat.author.channelUrl = "http://www.youtube.com/channel/" + self.chat.author.channelId
|
||||
self.chat.author.name = self.item["authorName"]["simpleText"]
|
||||
self.chat.author.imageUrl = self.item["authorPhoto"]["thumbnails"][1]["url"]
|
||||
|
||||
def get_message(self, renderer):
|
||||
def get_message(self, item):
|
||||
message = ''
|
||||
message_ex = []
|
||||
if renderer.get("message"):
|
||||
runs = renderer["message"].get("runs")
|
||||
if runs:
|
||||
for r in runs:
|
||||
if r:
|
||||
if r.get('emoji'):
|
||||
message += r['emoji'].get('shortcuts', [''])[0]
|
||||
message_ex.append({
|
||||
'id': r['emoji'].get('emojiId').split('/')[-1],
|
||||
'txt': r['emoji'].get('shortcuts', [''])[0],
|
||||
'url': r['emoji']['image']['thumbnails'][0].get('url')
|
||||
})
|
||||
else:
|
||||
message += r.get('text', '')
|
||||
message_ex.append(r.get('text', ''))
|
||||
runs = item.get("message", {}).get("runs", {})
|
||||
for r in runs:
|
||||
if not hasattr(r, "get"):
|
||||
continue
|
||||
if r.get('emoji'):
|
||||
message += r['emoji'].get('shortcuts', [''])[0]
|
||||
message_ex.append({
|
||||
'id': r['emoji'].get('emojiId').split('/')[-1],
|
||||
'txt': r['emoji'].get('shortcuts', [''])[0],
|
||||
'url': r['emoji']['image']['thumbnails'][0].get('url')
|
||||
})
|
||||
else:
|
||||
message += r.get('text', '')
|
||||
message_ex.append(r.get('text', ''))
|
||||
return message, message_ex
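The flattened `get_message` loop can be exercised on its own. The sketch below mirrors the new version as a free function; the name `flatten_runs` and the sample payload are assumptions for illustration, and the default for a missing `emojiId` is an added guard that is not in the diff.

```python
def flatten_runs(item):
    """Return (plain_text, structured_runs) for a chat item's message."""
    message = ''
    message_ex = []
    runs = item.get("message", {}).get("runs", [])
    for r in runs:
        if not hasattr(r, "get"):
            continue  # skip None or other non-dict entries
        emoji = r.get('emoji')
        if emoji:
            shortcut = emoji.get('shortcuts', [''])[0]
            message += shortcut
            message_ex.append({
                'id': emoji.get('emojiId', '').split('/')[-1],
                'txt': shortcut,
                'url': emoji['image']['thumbnails'][0].get('url'),
            })
        else:
            message += r.get('text', '')
            message_ex.append(r.get('text', ''))
    return message, message_ex


sample_item = {"message": {"runs": [
    {"text": "hello "},
    None,
    {"emoji": {"emojiId": "UCxyz/abc123",
               "shortcuts": [":wave:"],
               "image": {"thumbnails": [{"url": "https://example.invalid/e.png"}]}}},
]}}
text, parts = flatten_runs(sample_item)
```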
|
||||
|
||||
def get_badges(self, renderer):
|
||||
self.author.type = ''
|
||||
self.chat.author.type = ''
|
||||
isVerified = False
|
||||
isChatOwner = False
|
||||
isChatSponsor = False
|
||||
isChatModerator = False
|
||||
badges = renderer.get("authorBadges")
|
||||
if badges:
|
||||
for badge in badges:
|
||||
if badge["liveChatAuthorBadgeRenderer"].get("icon"):
|
||||
author_type = badge["liveChatAuthorBadgeRenderer"]["icon"]["iconType"]
|
||||
self.author.type = author_type
|
||||
if author_type == 'VERIFIED':
|
||||
isVerified = True
|
||||
if author_type == 'OWNER':
|
||||
isChatOwner = True
|
||||
if author_type == 'MODERATOR':
|
||||
isChatModerator = True
|
||||
if badge["liveChatAuthorBadgeRenderer"].get("customThumbnail"):
|
||||
isChatSponsor = True
|
||||
self.author.type = 'MEMBER'
|
||||
self.get_badgeurl(badge)
|
||||
badges = renderer.get("authorBadges", {})
|
||||
for badge in badges:
|
||||
if badge["liveChatAuthorBadgeRenderer"].get("icon"):
|
||||
author_type = badge["liveChatAuthorBadgeRenderer"]["icon"]["iconType"]
|
||||
self.chat.author.type = author_type
|
||||
if author_type == 'VERIFIED':
|
||||
isVerified = True
|
||||
if author_type == 'OWNER':
|
||||
isChatOwner = True
|
||||
if author_type == 'MODERATOR':
|
||||
isChatModerator = True
|
||||
if badge["liveChatAuthorBadgeRenderer"].get("customThumbnail"):
|
||||
isChatSponsor = True
|
||||
self.chat.author.type = 'MEMBER'
|
||||
self.get_badgeurl(badge)
|
||||
return isVerified, isChatOwner, isChatSponsor, isChatModerator
|
||||
|
||||
def get_badgeurl(self, badge):
|
||||
self.author.badgeUrl = badge["liveChatAuthorBadgeRenderer"]["customThumbnail"]["thumbnails"][0]["url"]
|
||||
self.chat.author.badgeUrl = badge["liveChatAuthorBadgeRenderer"]["customThumbnail"]["thumbnails"][0]["url"]
|
||||
|
||||
def get_datetime(self, timestamp):
|
||||
dt = datetime.fromtimestamp(timestamp / 1000000)
|
||||
return dt.strftime('%Y-%m-%d %H:%M:%S')
|
||||
|
||||
def get_chatobj(self):
|
||||
return self.chat
|
||||
|
||||
def clear(self):
|
||||
self.item = None
|
||||
self.chat = None
|
||||
|
||||
@@ -2,14 +2,14 @@ from .base import BaseRenderer
|
||||
|
||||
|
||||
class LiveChatLegacyPaidMessageRenderer(BaseRenderer):
|
||||
def __init__(self, item):
|
||||
super().__init__(item, "newSponsor")
|
||||
def settype(self):
|
||||
self.chat.type = "newSponsor"
|
||||
|
||||
def get_authordetails(self):
|
||||
super().get_authordetails()
|
||||
self.author.isChatSponsor = True
|
||||
self.chat.author.isChatSponsor = True
|
||||
|
||||
def get_message(self, renderer):
|
||||
message = (renderer["eventText"]["runs"][0]["text"]
|
||||
) + ' / ' + (renderer["detailText"]["simpleText"])
|
||||
def get_message(self, item):
|
||||
message = (item["eventText"]["runs"][0]["text"]
|
||||
) + ' / ' + (item["detailText"]["simpleText"])
|
||||
return message, [message]
|
||||
|
||||
@@ -2,14 +2,17 @@ from .base import BaseRenderer
|
||||
|
||||
|
||||
class LiveChatMembershipItemRenderer(BaseRenderer):
|
||||
def __init__(self, item):
|
||||
super().__init__(item, "newSponsor")
|
||||
def settype(self):
|
||||
self.chat.type = "newSponsor"
|
||||
|
||||
def get_authordetails(self):
|
||||
super().get_authordetails()
|
||||
self.author.isChatSponsor = True
|
||||
self.chat.author.isChatSponsor = True
|
||||
|
||||
def get_message(self, renderer):
|
||||
message = ''.join([mes.get("text", "")
|
||||
for mes in renderer["headerSubtext"]["runs"]])
|
||||
def get_message(self, item):
|
||||
try:
|
||||
message = ''.join([mes.get("text", "")
|
||||
for mes in item["headerSubtext"]["runs"]])
|
||||
except KeyError:
|
||||
return "Welcome New Member!", ["Welcome New Member!"]
|
||||
return message, [message]
|
||||
|
||||
@@ -9,23 +9,23 @@ class Colors:
|
||||
|
||||
|
||||
class LiveChatPaidMessageRenderer(BaseRenderer):
|
||||
def __init__(self, item):
|
||||
super().__init__(item, "superChat")
|
||||
def settype(self):
|
||||
self.chat.type = "superChat"
|
||||
|
||||
def get_snippet(self):
|
||||
super().get_snippet()
|
||||
amountDisplayString, symbol, amount = (
|
||||
self.get_amountdata(self.renderer)
|
||||
self.get_amountdata(self.item)
|
||||
)
|
||||
self.amountValue = amount
|
||||
self.amountString = amountDisplayString
|
||||
self.currency = currency.symbols[symbol]["fxtext"] if currency.symbols.get(
|
||||
self.chat.amountValue = amount
|
||||
self.chat.amountString = amountDisplayString
|
||||
self.chat.currency = currency.symbols[symbol]["fxtext"] if currency.symbols.get(
|
||||
symbol) else symbol
|
||||
self.bgColor = self.renderer.get("bodyBackgroundColor", 0)
|
||||
self.colors = self.get_colors()
|
||||
self.chat.bgColor = self.item.get("bodyBackgroundColor", 0)
|
||||
self.chat.colors = self.get_colors()
|
||||
|
||||
def get_amountdata(self, renderer):
|
||||
amountDisplayString = renderer["purchaseAmountText"]["simpleText"]
|
||||
def get_amountdata(self, item):
|
||||
amountDisplayString = item["purchaseAmountText"]["simpleText"]
|
||||
m = superchat_regex.search(amountDisplayString)
|
||||
if m:
|
||||
symbol = m.group(1)
|
||||
@@ -36,11 +36,12 @@ class LiveChatPaidMessageRenderer(BaseRenderer):
|
||||
return amountDisplayString, symbol, amount
|
||||
|
||||
def get_colors(self):
|
||||
item = self.item
|
||||
colors = Colors()
|
||||
colors.headerBackgroundColor = self.renderer.get("headerBackgroundColor", 0)
|
||||
colors.headerTextColor = self.renderer.get("headerTextColor", 0)
|
||||
colors.bodyBackgroundColor = self.renderer.get("bodyBackgroundColor", 0)
|
||||
colors.bodyTextColor = self.renderer.get("bodyTextColor", 0)
|
||||
colors.timestampColor = self.renderer.get("timestampColor", 0)
|
||||
colors.authorNameTextColor = self.renderer.get("authorNameTextColor", 0)
|
||||
colors.headerBackgroundColor = item.get("headerBackgroundColor", 0)
|
||||
colors.headerTextColor = item.get("headerTextColor", 0)
|
||||
colors.bodyBackgroundColor = item.get("bodyBackgroundColor", 0)
|
||||
colors.bodyTextColor = item.get("bodyTextColor", 0)
|
||||
colors.timestampColor = item.get("timestampColor", 0)
|
||||
colors.authorNameTextColor = item.get("authorNameTextColor", 0)
|
||||
return colors
|
||||
|
||||
@@ -4,30 +4,30 @@ from .base import BaseRenderer
|
||||
superchat_regex = re.compile(r"^(\D*)(\d{1,3}(,\d{3})*(\.\d*)*\b)$")
|
||||
|
||||
|
||||
class Colors:
|
||||
class Colors2:
|
||||
pass
|
||||
|
||||
|
||||
class LiveChatPaidStickerRenderer(BaseRenderer):
|
||||
def __init__(self, item):
|
||||
super().__init__(item, "superSticker")
|
||||
def settype(self):
|
||||
self.chat.type = "superSticker"
|
||||
|
||||
def get_snippet(self):
|
||||
super().get_snippet()
|
||||
amountDisplayString, symbol, amount = (
|
||||
self.get_amountdata(self.renderer)
|
||||
self.get_amountdata(self.item)
|
||||
)
|
||||
self.amountValue = amount
|
||||
self.amountString = amountDisplayString
|
||||
self.currency = currency.symbols[symbol]["fxtext"] if currency.symbols.get(
|
||||
self.chat.amountValue = amount
|
||||
self.chat.amountString = amountDisplayString
|
||||
self.chat.currency = currency.symbols[symbol]["fxtext"] if currency.symbols.get(
|
||||
symbol) else symbol
|
||||
self.bgColor = self.renderer.get("backgroundColor", 0)
|
||||
self.sticker = "".join(("https:",
|
||||
self.renderer["sticker"]["thumbnails"][0]["url"]))
|
||||
self.colors = self.get_colors()
|
||||
self.chat.bgColor = self.item.get("backgroundColor", 0)
|
||||
self.chat.sticker = "".join(("https:",
|
||||
self.item["sticker"]["thumbnails"][0]["url"]))
|
||||
self.chat.colors = self.get_colors()
|
||||
|
||||
def get_amountdata(self, renderer):
|
||||
amountDisplayString = renderer["purchaseAmountText"]["simpleText"]
|
||||
def get_amountdata(self, item):
|
||||
amountDisplayString = item["purchaseAmountText"]["simpleText"]
|
||||
m = superchat_regex.search(amountDisplayString)
|
||||
if m:
|
||||
symbol = m.group(1)
|
||||
@@ -38,9 +38,10 @@ class LiveChatPaidStickerRenderer(BaseRenderer):
|
||||
return amountDisplayString, symbol, amount
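The part of `get_amountdata` between the two hunks is not shown, so the following is only a sketch of how `superchat_regex` splits a display string into a currency prefix and a number. The helper name `split_amount`, the comma stripping, and the `("", 0.0)` fallback are assumptions, not the elided code.

```python
import re

superchat_regex = re.compile(r"^(\D*)(\d{1,3}(,\d{3})*(\.\d*)*\b)$")


def split_amount(display: str):
    """Split e.g. '¥1,000' into ('¥', 1000.0); ('', 0.0) when nothing matches."""
    m = superchat_regex.search(display)
    if m is None:
        return "", 0.0
    symbol = m.group(1)           # everything before the first digit
    try:
        amount = float(m.group(2).replace(",", ""))
    except ValueError:
        # The pattern allows repeated '.ddd' groups, which float() rejects.
        amount = 0.0
    return symbol, amount
```

Group 1 keeps any trailing whitespace of the prefix (for example `"SGD "`), so a caller looking the symbol up in a table may need to account for that.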
|
||||
|
||||
def get_colors(self):
|
||||
colors = Colors()
|
||||
colors.moneyChipBackgroundColor = self.renderer.get("moneyChipBackgroundColor", 0)
|
||||
colors.moneyChipTextColor = self.renderer.get("moneyChipTextColor", 0)
|
||||
colors.backgroundColor = self.renderer.get("backgroundColor", 0)
|
||||
colors.authorNameTextColor = self.renderer.get("authorNameTextColor", 0)
|
||||
item = self.item
|
||||
colors = Colors2()
|
||||
colors.moneyChipBackgroundColor = item.get("moneyChipBackgroundColor", 0)
|
||||
colors.moneyChipTextColor = item.get("moneyChipTextColor", 0)
|
||||
colors.backgroundColor = item.get("backgroundColor", 0)
|
||||
colors.authorNameTextColor = item.get("authorNameTextColor", 0)
|
||||
return colors
|
||||
|
||||
@@ -2,5 +2,5 @@ from .base import BaseRenderer
|
||||
|
||||
|
||||
class LiveChatTextMessageRenderer(BaseRenderer):
|
||||
def __init__(self, item):
|
||||
super().__init__(item, "textMessage")
|
||||
def settype(self):
|
||||
self.chat.type = "textMessage"
|
||||
|
||||
@@ -1,10 +1,13 @@
|
||||
import httpx
|
||||
import os
|
||||
import re
|
||||
import requests
|
||||
import time
|
||||
from base64 import standard_b64encode
|
||||
from concurrent.futures import ThreadPoolExecutor
|
||||
from .chat_processor import ChatProcessor
|
||||
from .default.processor import DefaultProcessor
|
||||
|
||||
from ..exceptions import UnknownConnectionError
|
||||
import tempfile
|
||||
|
||||
PATTERN = re.compile(r"(.*)\(([0-9]+)\)$")
|
||||
|
||||
@@ -43,20 +46,24 @@ class HTMLArchiver(ChatProcessor):
|
||||
'''
|
||||
HTMLArchiver saves chat data as HTML table format.
|
||||
'''
|
||||
def __init__(self, save_path):
|
||||
def __init__(self, save_path, callback=None):
|
||||
super().__init__()
|
||||
self.client = httpx.Client(http2=True)
|
||||
self.save_path = self._checkpath(save_path)
|
||||
self.processor = DefaultProcessor()
|
||||
self.emoji_table = {} # tuble for custom emojis. key: emoji_id, value: base64 encoded image binary.
|
||||
self.header = [HEADER_HTML]
|
||||
self.body = ['<body>\n', '<table class="css">\n', self._parse_table_header(fmt_headers)]
|
||||
self.emoji_table = {} # dict for custom emojis. key: emoji_id, value: base64 encoded image binary.
|
||||
self.callback = callback
|
||||
self.executor = ThreadPoolExecutor(max_workers=10)
|
||||
self.tmp_fp = tempfile.NamedTemporaryFile(mode="a", encoding="utf-8", delete=False)
|
||||
self.tmp_filename = self.tmp_fp.name
|
||||
self.counter = 0
|
||||
|
||||
def _checkpath(self, filepath):
|
||||
splitter = os.path.splitext(os.path.basename(filepath))
|
||||
body = splitter[0]
|
||||
extention = splitter[1]
|
||||
newpath = filepath
|
||||
counter = 0
|
||||
counter = 1
|
||||
while os.path.exists(newpath):
|
||||
match = re.search(PATTERN, body)
|
||||
if match:
|
||||
@@ -76,21 +83,26 @@ class HTMLArchiver(ChatProcessor):
|
||||
save_path : str :
|
||||
Actual save path of file.
|
||||
total_lines : int :
|
||||
count of total lines written to the file.
|
||||
Count of total lines written to the file.
|
||||
"""
|
||||
if chat_components is None or len(chat_components) == 0:
|
||||
return
|
||||
self.body.extend(
|
||||
(self._parse_html_line((
|
||||
c.datetime,
|
||||
c.elapsedTime,
|
||||
c.author.name,
|
||||
self._parse_message(c.messageEx),
|
||||
c.amountString,
|
||||
c.author.type,
|
||||
c.author.channelId)
|
||||
) for c in self.processor.process(chat_components).items)
|
||||
)
|
||||
return self.save_path ,self.counter
|
||||
for c in self.processor.process(chat_components).items:
|
||||
self.tmp_fp.write(
|
||||
self._parse_html_line((
|
||||
c.datetime,
|
||||
c.elapsedTime,
|
||||
c.author.name,
|
||||
self._parse_message(c.messageEx),
|
||||
c.amountString,
|
||||
c.author.type,
|
||||
c.author.channelId)
|
||||
)
|
||||
)
|
||||
if self.callback:
|
||||
self.callback(None, 1)
|
||||
self.counter += 1
|
||||
return self.save_path, self.counter
|
||||
|
||||
def _parse_html_line(self, raw_line):
|
||||
return ''.join(('<tr>',
|
||||
@@ -108,13 +120,23 @@ class HTMLArchiver(ChatProcessor):
|
||||
for item in message_items)
|
||||
|
||||
def _encode_img(self, url):
|
||||
resp = requests.get(url)
|
||||
err = None
|
||||
for _ in range(5):
|
||||
try:
|
||||
resp = self.client.get(url, timeout=30)
|
||||
break
|
||||
except httpx.HTTPError as e:
|
||||
err = e
|
||||
time.sleep(3)
|
||||
else:
|
||||
raise UnknownConnectionError(str(err))
|
||||
|
||||
return standard_b64encode(resp.content).decode()
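The retry loop in `_encode_img` relies on Python's `for`/`else`: the `else` branch runs only when the loop finishes without `break`. A network-free sketch of the same shape is below; `fetch_with_retry`, its parameters, and the use of `OSError`/`ConnectionError` in place of `httpx.HTTPError`/`UnknownConnectionError` are assumptions for illustration.

```python
import time


def fetch_with_retry(fetch, retries=5, delay=0.0):
    """Call `fetch()` up to `retries` times and return its first success."""
    err = None
    for _ in range(retries):
        try:
            result = fetch()
            break                      # success: skip the else branch
        except OSError as e:
            err = e
            time.sleep(delay)
    else:
        # Reached only when every attempt raised.
        raise ConnectionError(str(err))
    return result
```

Keeping the last exception in `err` lets the final error message report the actual cause instead of a generic text, which is what the diff does as well.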
|
||||
|
||||
def _set_emoji_table(self, item: dict):
|
||||
emoji_id = item['id']
|
||||
emoji_id = ''.join(('Z', item['id'])) if 48 <= ord(item['id'][0]) <= 57 else item['id']
|
||||
if emoji_id not in self.emoji_table:
|
||||
self.emoji_table.setdefault(emoji_id, self._encode_img(item['url']))
|
||||
self.emoji_table.setdefault(emoji_id, self.executor.submit(self._encode_img, item['url']))
|
||||
return emoji_id
|
||||
|
||||
def _stylecode(self, name, code, width, height):
|
||||
@@ -125,13 +147,24 @@ class HTMLArchiver(ChatProcessor):
|
||||
def _create_styles(self):
|
||||
return '\n'.join(('<style type="text/css">',
|
||||
TABLE_CSS,
|
||||
'\n'.join(self._stylecode(key, self.emoji_table[key], 24, 24)
|
||||
'\n'.join(self._stylecode(key, self.emoji_table[key].result(), 24, 24)
|
||||
for key in self.emoji_table.keys()),
|
||||
'</style>\n'))
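Storing the `Future` returned by `executor.submit` in `emoji_table` and calling `.result()` only in `_create_styles` defers the blocking wait until the values are needed. A minimal sketch of that pattern, with a hypothetical `LazyTable` class and an arbitrary `loader` callable standing in for the image download:

```python
from concurrent.futures import ThreadPoolExecutor


class LazyTable:
    """Cache one background task per key; block only when resolving."""

    def __init__(self, loader, max_workers=4):
        self._loader = loader
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._table = {}

    def request(self, key):
        # Submit the work once per key; later requests reuse the Future.
        if key not in self._table:
            self._table[key] = self._executor.submit(self._loader, key)
        return key

    def resolve(self):
        # .result() waits for each task and re-raises any exception it hit.
        return {key: fut.result() for key, fut in self._table.items()}

    def close(self):
        self._executor.shutdown()
```

Because `.result()` re-raises, a download that failed in a worker thread surfaces at resolve time rather than at the point where it was requested.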
|
||||
|
||||
def finalize(self):
|
||||
self.header.extend([self._create_styles(), '</head>\n'])
|
||||
self.body.extend(['</table>\n</body>'])
|
||||
with open(self.save_path, mode='a', encoding='utf-8') as f:
|
||||
f.writelines(self.header)
|
||||
f.writelines(self.body)
|
||||
if self.tmp_fp:
|
||||
self.tmp_fp.flush()
|
||||
self.tmp_fp = None
|
||||
with open(self.save_path, mode='w', encoding='utf-8') as outfile:
|
||||
# write header
|
||||
outfile.writelines((
|
||||
HEADER_HTML, self._create_styles(), '</head>\n',
|
||||
'<body>\n', '<table class="css">\n',
|
||||
self._parse_table_header(fmt_headers)))
|
||||
# write body
|
||||
fp = open(self.tmp_filename, mode="r", encoding="utf-8")
|
||||
for line in fp:
|
||||
outfile.write(line)
|
||||
outfile.write('</table>\n</body>\n</html>')
|
||||
fp.close()
|
||||
os.remove(self.tmp_filename)
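The new `finalize` writes rows to a temporary file as they arrive and assembles the final document only at the end, because the header depends on data (the emoji styles) that is complete only then. A sketch of that spooling approach follows; `SpoolWriter` and its method names are hypothetical, and unlike the diff it closes the temporary file explicitly before reading it back.

```python
import os
import tempfile


class SpoolWriter:
    """Append body lines to a temp file; build the final file at the end."""

    def __init__(self):
        self._tmp = tempfile.NamedTemporaryFile(
            mode="a", encoding="utf-8", delete=False)
        self.tmp_name = self._tmp.name

    def add(self, line):
        self._tmp.write(line)

    def finalize(self, path, header, footer):
        self._tmp.close()  # flush and release the handle before re-opening
        with open(path, mode="w", encoding="utf-8") as out, \
                open(self.tmp_name, mode="r", encoding="utf-8") as body:
            out.write(header)
            for line in body:      # stream the body, one line at a time
                out.write(line)
            out.write(footer)
        os.remove(self.tmp_name)
```

Memory use stays roughly constant in the number of rows, in contrast to the earlier version that accumulated every row in `self.body`.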
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import aiohttp
|
||||
import asyncio
|
||||
import json
|
||||
import httpx
|
||||
import socket
|
||||
from . import parser
|
||||
from . block import Block
|
||||
from . worker import ExtractWorker
|
||||
@@ -12,11 +12,15 @@ from concurrent.futures import CancelledError
|
||||
from json import JSONDecodeError
|
||||
from urllib.parse import quote
|
||||
|
||||
|
||||
headers = config.headers
|
||||
REPLAY_URL = "https://www.youtube.com/live_chat_replay/" \
|
||||
"get_live_chat_replay?continuation="
|
||||
MAX_RETRY_COUNT = 3
|
||||
|
||||
# Set to avoid duplicate parameters
|
||||
param_set = set()
|
||||
|
||||
|
||||
def _split(start, end, count, min_interval_sec=120):
|
||||
"""
|
||||
@@ -51,11 +55,12 @@ def _split(start, end, count, min_interval_sec=120):
|
||||
|
||||
|
||||
def ready_blocks(video_id, duration, div, callback):
|
||||
param_set.clear()
|
||||
if div <= 0:
|
||||
raise ValueError
|
||||
|
||||
async def _get_blocks(video_id, duration, div, callback):
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with httpx.AsyncClient(http2=True) as session:
|
||||
tasks = [_create_block(session, video_id, seektime, callback)
|
||||
for seektime in _split(-1, duration, div)]
|
||||
return await asyncio.gather(*tasks)
|
||||
@@ -63,17 +68,24 @@ def ready_blocks(video_id, duration, div, callback):
|
||||
async def _create_block(session, video_id, seektime, callback):
|
||||
continuation = arcparam.getparam(video_id, seektime=seektime)
|
||||
url = f"{REPLAY_URL}{quote(continuation)}&pbj=1"
|
||||
err = None
|
||||
for _ in range(MAX_RETRY_COUNT):
|
||||
try:
|
||||
async with session.get(url, headers=headers) as resp:
|
||||
text = await resp.text()
|
||||
next_continuation, actions = parser.parse(json.loads(text))
|
||||
if continuation in param_set:
|
||||
next_continuation, actions = None, []
|
||||
break
|
||||
param_set.add(continuation)
|
||||
resp = await session.get(url, headers=headers, timeout=10)
|
||||
next_continuation, actions = parser.parse(resp.json())
|
||||
break
|
||||
except JSONDecodeError:
|
||||
await asyncio.sleep(3)
|
||||
except httpx.HTTPError as e:
|
||||
err = e
|
||||
await asyncio.sleep(3)
|
||||
else:
|
||||
cancel()
|
||||
raise UnknownConnectionError("Abort: Unknown connection error.")
|
||||
raise UnknownConnectionError("Abort:" + str(err))
|
||||
|
||||
if actions:
|
||||
first = parser.get_offset(actions[0])
|
||||
@@ -106,23 +118,33 @@ def fetch_patch(callback, blocks, video_id):
|
||||
)
|
||||
for block in blocks
|
||||
]
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with httpx.AsyncClient() as session:
|
||||
tasks = [worker.run(session) for worker in workers]
|
||||
return await asyncio.gather(*tasks)
|
||||
|
||||
async def _fetch(continuation, session) -> Patch:
|
||||
url = f"{REPLAY_URL}{quote(continuation)}&pbj=1"
|
||||
err = None
|
||||
for _ in range(MAX_RETRY_COUNT):
|
||||
try:
|
||||
async with session.get(url, headers=config.headers) as resp:
|
||||
chat_json = await resp.text()
|
||||
continuation, actions = parser.parse(json.loads(chat_json))
|
||||
if continuation in param_set:
|
||||
continuation, actions = None, []
|
||||
break
|
||||
param_set.add(continuation)
|
||||
resp = await session.get(url, headers=config.headers)
|
||||
continuation, actions = parser.parse(resp.json())
|
||||
break
|
||||
except JSONDecodeError:
|
||||
await asyncio.sleep(3)
|
||||
except httpx.HTTPError as e:
|
||||
err = e
|
||||
await asyncio.sleep(3)
|
||||
except socket.error as error:
|
||||
print("socket error", error.errno)
|
||||
await asyncio.sleep(3)
|
||||
else:
|
||||
cancel()
|
||||
raise UnknownConnectionError("Abort: Unknown connection error.")
|
||||
raise UnknownConnectionError("Abort:" + str(err))
|
||||
|
||||
if actions:
|
||||
last = parser.get_offset(actions[-1])
|
||||
@@ -143,15 +165,10 @@ def fetch_patch(callback, blocks, video_id):
|
||||
|
||||
|
||||
async def _shutdown():
|
||||
print("\nshutdown...")
|
||||
tasks = [t for t in asyncio.all_tasks()
|
||||
if t is not asyncio.current_task()]
|
||||
for task in tasks:
|
||||
task.cancel()
|
||||
try:
|
||||
await task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
|
||||
|
||||
def cancel():
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
from typing import Generator
|
||||
from . import asyncdl
|
||||
from . import duplcheck
|
||||
from .. videoinfo import VideoInfo
|
||||
@@ -60,11 +61,10 @@ class Extractor:
|
||||
self.blocks = duplcheck.remove_duplicate_tail(self.blocks)
|
||||
return self
|
||||
|
||||
def _combine(self):
|
||||
ret = []
|
||||
def _get_chatdata(self) -> Generator:
|
||||
for block in self.blocks:
|
||||
ret.extend(block.chat_data)
|
||||
return ret
|
||||
for chatdata in block.chat_data:
|
||||
yield chatdata
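Replacing `_combine` with a generator means the chat data of all blocks is no longer copied into one list. A small sketch with a stand-in `Block` class (hypothetical, holding only `chat_data`) shows the same flattening and its `itertools` equivalent:

```python
from itertools import chain
from typing import Iterable, Iterator


class Block:
    """Stand-in holding only the attribute the generator reads."""

    def __init__(self, chat_data):
        self.chat_data = chat_data


def iter_chatdata(blocks: Iterable[Block]) -> Iterator[dict]:
    # Yield items block by block without building an intermediate list.
    for block in blocks:
        for chatdata in block.chat_data:
            yield chatdata


blocks = [Block([{"id": 1}, {"id": 2}]), Block([]), Block([{"id": 3}])]
flattened = list(iter_chatdata(blocks))
same_via_chain = list(chain.from_iterable(b.chat_data for b in blocks))
```

A generator can be consumed only once, so a caller that needs a second pass has to materialise it with `list(...)` first.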
|
||||
|
||||
def _execute_extract_operations(self):
|
||||
return (
|
||||
@@ -74,7 +74,7 @@ class Extractor:
|
||||
._remove_overlap()
|
||||
._download_blocks()
|
||||
._remove_duplicate_tail()
|
||||
._combine()
|
||||
._get_chatdata()
|
||||
)
|
||||
|
||||
def extract(self):
|
||||
|
||||
@@ -42,10 +42,14 @@ def get_offset(item):
|
||||
|
||||
|
||||
def get_id(item):
|
||||
return list((list(item['replayChatItemAction']["actions"][0].values()
|
||||
)[0])['item'].values())[0].get('id')
|
||||
a = list(item['replayChatItemAction']["actions"][0].values())[0].get('item')
|
||||
if a:
|
||||
return list(a.values())[0].get('id')
|
||||
return None
|
||||
|
||||
|
||||
def get_type(item):
|
||||
return list((list(item['replayChatItemAction']["actions"][0].values()
|
||||
)[0])['item'].keys())[0]
|
||||
a = list(item['replayChatItemAction']["actions"][0].values())[0].get('item')
|
||||
if a:
|
||||
return list(a.keys())[0]
|
||||
return None
|
||||
|
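The `get_id` and `get_type` changes above swap a chained lookup that assumes an `item` key for a `.get('item')` call followed by a guard, so actions without an item return `None` instead of raising `KeyError`. A minimal sketch of the guarded form follows. `_first_item` is a hypothetical helper, and both sample payloads are made up for illustration.

```python
def _first_item(action):
    # The first value of the first action is the payload dict;
    # it may or may not carry an 'item' key.
    payload = list(action["replayChatItemAction"]["actions"][0].values())[0]
    return payload.get("item")


def get_id(action):
    item = _first_item(action)
    if item:
        return list(item.values())[0].get("id")
    return None


def get_type(action):
    item = _first_item(action)
    if item:
        return list(item.keys())[0]
    return None


# Invented sample data: one action with an item, one without.
with_item = {"replayChatItemAction": {"actions": [
    {"addChatItemAction": {"item": {
        "liveChatTextMessageRenderer": {"id": "dummy_id"}}}}]}}
without_item = {"replayChatItemAction": {"actions": [
    {"someOtherAction": {"durationSec": "5"}}]}}

print(get_id(with_item), get_type(with_item))
print(get_id(without_item), get_type(without_item))
```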
||||
@@ -1,12 +1,12 @@
|
||||
from . block import Block
|
||||
from . patch import fill, split
|
||||
from ... paramgen import arcparam
|
||||
from typing import Tuple
|
||||
|
||||
|
||||
class ExtractWorker:
|
||||
"""
|
||||
ExtractWorker associates a download session with a block.
|
||||
|
||||
When the worker finishes fetching, the block
|
||||
being fetched is splitted and assigned the free worker.
|
||||
|
@@ -76,7 +76,7 @@ def _search_new_block(worker) -> Block:
     return new_block
 
 
-def _get_undone_block(blocks) -> (int, Block):
+def _get_undone_block(blocks) -> Tuple[int, Block]:
     min_interval_ms = 120000
     max_remaining = 0
     undone_block = None
||||
|
||||
@@ -1,141 +0,0 @@
|
||||
|
||||
import aiohttp
|
||||
import asyncio
|
||||
import json
|
||||
from . import parser
|
||||
from . block import Block
|
||||
from . worker import ExtractWorker
|
||||
from . patch import Patch
|
||||
from ... import config
|
||||
from ... paramgen import arcparam_mining as arcparam
|
||||
from concurrent.futures import CancelledError
|
||||
from urllib.parse import quote
|
||||
|
||||
headers = config.headers
|
||||
REPLAY_URL = "https://www.youtube.com/live_chat_replay?continuation="
|
||||
INTERVAL = 1
|
||||
def _split(start, end, count, min_interval_sec = 120):
|
||||
"""
|
||||
Split section from `start` to `end` into `count` pieces,
|
||||
and returns the beginning of each piece.
|
||||
The `count` is adjusted so that the length of each piece
|
||||
is no smaller than `min_interval`.
|
||||
|
||||
Returns:
|
||||
--------
|
||||
List of the offset of each block's first chat data.
|
||||
"""
|
||||
|
||||
if not (isinstance(start,int) or isinstance(start,float)) or \
|
||||
not (isinstance(end,int) or isinstance(end,float)):
|
||||
raise ValueError("start/end must be int or float")
|
||||
if not isinstance(count,int):
|
||||
raise ValueError("count must be int")
|
||||
if start>end:
|
||||
raise ValueError("end must be equal to or greater than start.")
|
||||
if count<1:
|
||||
raise ValueError("count must be equal to or greater than 1.")
|
||||
if (end-start)/count < min_interval_sec:
|
||||
count = int((end-start)/min_interval_sec)
|
||||
if count == 0 : count = 1
|
||||
interval= (end-start)/count
|
||||
|
||||
if count == 1:
|
||||
return [start]
|
||||
return sorted( list(set( [int(start + interval*j)
|
||||
for j in range(count) ])))
|
||||
|
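The docstring of `_split` describes cutting the range into `count` pieces while keeping each piece at least `min_interval_sec` long. The condensed restatement below omits the argument validation and is only a sketch of that arithmetic. The function name `split` is illustrative, and the expected values are the ones asserted by `test_asyncdl_split` in this same diff.

```python
def split(start, end, count, min_interval_sec=120):
    # Shrink count so that each piece spans at least min_interval_sec.
    if (end - start) / count < min_interval_sec:
        count = int((end - start) / min_interval_sec)
    if count == 0:
        count = 1
    interval = (end - start) / count
    if count == 1:
        return [start]
    # Truncate each boundary to an int and drop duplicates.
    return sorted(set(int(start + interval * j) for j in range(count)))


print(split(0, 1000, 5))   # [0, 200, 400, 600, 800]
print(split(0, 500, 5))    # [0, 125, 250, 375]
```

In the second call a count of 5 would give 100-second pieces, so the count is reduced to `int(500 / 120) == 4` and the interval becomes 125 seconds.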
||||
def ready_blocks(video_id, duration, div, callback):
|
||||
if div <= 0: raise ValueError
|
||||
|
||||
async def _get_blocks( video_id, duration, div, callback):
|
||||
async with aiohttp.ClientSession() as session:
|
||||
tasks = [_create_block(session, video_id, seektime, callback)
|
||||
for seektime in _split(0, duration, div)]
|
||||
return await asyncio.gather(*tasks)
|
||||
|
||||
|
||||
|
||||
async def _create_block(session, video_id, seektime, callback):
|
||||
continuation = arcparam.getparam(video_id, seektime = seektime)
|
||||
url=(f"{REPLAY_URL}{quote(continuation)}&playerOffsetMs="
|
||||
f"{int(seektime*1000)}&hidden=false&pbj=1")
|
||||
async with session.get(url, headers = headers) as resp:
|
||||
chat_json = await resp.text()
|
||||
if chat_json is None:
|
||||
return
|
||||
continuation, actions = parser.parse(json.loads(chat_json)[1])
|
||||
first = seektime
|
||||
seektime += INTERVAL
|
||||
if callback:
|
||||
callback(actions, INTERVAL)
|
||||
return Block(
|
||||
continuation = continuation,
|
||||
chat_data = actions,
|
||||
first = first,
|
||||
last = seektime,
|
||||
seektime = seektime
|
||||
)
|
||||
"""
|
||||
fetch initial blocks.
|
||||
"""
|
||||
loop = asyncio.get_event_loop()
|
||||
blocks = loop.run_until_complete(
|
||||
_get_blocks(video_id, duration, div, callback))
|
||||
return blocks
|
||||
|
||||
def fetch_patch(callback, blocks, video_id):
|
||||
|
||||
async def _allocate_workers():
|
||||
workers = [
|
||||
ExtractWorker(
|
||||
fetch = _fetch, block = block,
|
||||
blocks = blocks, video_id = video_id
|
||||
)
|
||||
for block in blocks
|
||||
]
|
||||
async with aiohttp.ClientSession() as session:
|
||||
tasks = [worker.run(session) for worker in workers]
|
||||
return await asyncio.gather(*tasks)
|
||||
|
||||
async def _fetch(seektime,session) -> Patch:
|
||||
continuation = arcparam.getparam(video_id, seektime = seektime)
|
||||
url=(f"{REPLAY_URL}{quote(continuation)}&playerOffsetMs="
|
||||
f"{int(seektime*1000)}&hidden=false&pbj=1")
|
||||
async with session.get(url,headers = config.headers) as resp:
|
||||
chat_json = await resp.text()
|
||||
actions = []
|
||||
try:
|
||||
if chat_json is None:
|
||||
return Patch()
|
||||
continuation, actions = parser.parse(json.loads(chat_json)[1])
|
||||
except json.JSONDecodeError:
|
||||
pass
|
||||
if callback:
|
||||
callback(actions, INTERVAL)
|
||||
return Patch(chats = actions, continuation = continuation,
|
||||
seektime = seektime, last = seektime)
|
||||
"""
|
||||
allocate workers and assign blocks.
|
||||
"""
|
||||
loop = asyncio.get_event_loop()
|
||||
try:
|
||||
loop.run_until_complete(_allocate_workers())
|
||||
except CancelledError:
|
||||
pass
|
||||
|
||||
async def _shutdown():
|
||||
print("\nshutdown...")
|
||||
tasks = [t for t in asyncio.all_tasks()
|
||||
if t is not asyncio.current_task()]
|
||||
for task in tasks:
|
||||
task.cancel()
|
||||
try:
|
||||
await task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
|
||||
def cancel():
|
||||
loop = asyncio.get_event_loop()
|
||||
loop.create_task(_shutdown())
|
||||
|
||||
@@ -1,62 +0,0 @@
|
||||
from . import parser
|
||||
class Block:
|
||||
"""Block object represents something like a box
|
||||
to join chunk of chatdata.
|
||||
|
||||
Parameter:
|
||||
---------
|
||||
first : int :
|
||||
videoOffsetTimeMs of the first chat_data
|
||||
(chat_data[0])
|
||||
|
||||
last : int :
|
||||
videoOffsetTimeMs of the last chat_data.
|
||||
(chat_data[-1])
|
||||
|
||||
this value increases as fetching chatdata progresses.
|
||||
|
||||
end : int :
|
||||
target videoOffsetTimeMs of last chat data for extract,
|
||||
equals to first videoOffsetTimeMs of next block.
|
||||
when extract worker reaches this offset, stop fetching.
|
||||
|
||||
continuation : str :
|
||||
continuation param of last chat data.
|
||||
|
||||
chat_data : list
|
||||
|
||||
done : bool :
|
||||
whether this block has been fetched.
|
||||
|
||||
remaining : int :
|
||||
remaining data to extract.
|
||||
equals end - last.
|
||||
|
||||
is_last : bool :
|
||||
whether this block is the last one in blocklist.
|
||||
|
||||
during_split : bool :
|
||||
whether this block is in the process of during_split.
|
||||
while True, this block is excluded from duplicate split procedure.
|
||||
|
||||
seektime : float :
|
||||
the last position of this block(seconds) already fetched.
|
||||
"""
|
||||
|
||||
__slots__ = ['first','last','end','continuation','chat_data','remaining',
|
||||
'done','is_last','during_split','seektime']
|
||||
|
||||
def __init__(self, first = 0, last = 0, end = 0,
|
||||
continuation = '', chat_data = [], is_last = False,
|
||||
during_split = False, seektime = None):
|
||||
self.first = first
|
||||
self.last = last
|
||||
self.end = end
|
||||
self.continuation = continuation
|
||||
self.chat_data = chat_data
|
||||
self.done = False
|
||||
self.remaining = self.end - self.last
|
||||
self.is_last = is_last
|
||||
self.during_split = during_split
|
||||
self.seektime = seektime
|
||||
|
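The removed `Block.__init__` uses `chat_data = []` as a default argument. In Python a default list is created once, when the function is defined, and is then shared by every call that omits the argument. The sketch below is a reduced variant of the class (not the project's code) that avoids the sharing with a `None` sentinel; it also shows that `remaining` is computed once, as `end - last`, at construction time.

```python
class Block:
    __slots__ = ['first', 'last', 'end', 'chat_data', 'remaining']

    def __init__(self, first=0, last=0, end=0, chat_data=None):
        self.first = first
        self.last = last
        self.end = end
        # A fresh list per instance; a `[]` default would be shared.
        self.chat_data = [] if chat_data is None else chat_data
        # Snapshot taken at construction; not updated when `last` changes.
        self.remaining = self.end - self.last


a = Block()
b = Block()
a.chat_data.append("chat")
print(b.chat_data)                          # unaffected by the append on `a`
print(Block(last=100, end=250).remaining)
```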
||||
@@ -1,73 +0,0 @@
|
||||
import re
|
||||
from ... import config
|
||||
from ... exceptions import (
|
||||
ResponseContextError,
|
||||
NoContents, NoContinuation)
|
||||
|
||||
logger = config.logger(__name__)
|
||||
|
||||
|
||||
def parse(jsn):
|
||||
"""
|
||||
Parse replay chat data.
|
||||
Parameter:
|
||||
----------
|
||||
jsn : dict
|
||||
JSON of replay chat data.
|
||||
Returns:
|
||||
------
|
||||
continuation : str
|
||||
actions : list
|
||||
|
||||
"""
|
||||
if jsn is None:
|
||||
raise ValueError("parameter JSON is None")
|
||||
if jsn['response']['responseContext'].get('errors'):
|
||||
raise ResponseContextError(
|
||||
'video_id is invalid or private/deleted.')
|
||||
contents = jsn["response"].get('continuationContents')
|
||||
if contents is None:
|
||||
raise NoContents('No chat data.')
|
||||
|
||||
cont = contents['liveChatContinuation']['continuations'][0]
|
||||
if cont is None:
|
||||
raise NoContinuation('No Continuation')
|
||||
metadata = cont.get('liveChatReplayContinuationData')
|
||||
if metadata:
|
||||
continuation = metadata.get("continuation")
|
||||
actions = contents['liveChatContinuation'].get('actions')
|
||||
if continuation:
|
||||
return continuation, [action["replayChatItemAction"]["actions"][0]
|
||||
for action in actions
|
||||
if list(action['replayChatItemAction']["actions"][0].values()
|
||||
)[0]['item'].get("liveChatPaidMessageRenderer")
|
||||
or list(action['replayChatItemAction']["actions"][0].values()
|
||||
)[0]['item'].get("liveChatPaidStickerRenderer")
|
||||
]
|
||||
return None, []
|
||||
|
||||
|
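The list comprehension at the end of `parse` keeps only actions whose item holds a `liveChatPaidMessageRenderer` or a `liveChatPaidStickerRenderer`. The sketch below expresses the same filter as a loop. `filter_paid` and `PAID_RENDERERS` are hypothetical names, the sample actions are invented, and unlike the original the sketch uses `.get("item", {})` so that an action without an item is skipped rather than raising `KeyError`.

```python
PAID_RENDERERS = ("liveChatPaidMessageRenderer",
                  "liveChatPaidStickerRenderer")


def filter_paid(actions):
    kept = []
    for action in actions:
        inner = action["replayChatItemAction"]["actions"][0]
        # Assumption: tolerate payloads that have no 'item' key.
        item = list(inner.values())[0].get("item", {})
        if any(item.get(name) for name in PAID_RENDERERS):
            kept.append(inner)
    return kept


def _wrap(renderer_name, chat_id):
    # Build an invented replay action around a single renderer.
    return {"replayChatItemAction": {"actions": [
        {"addChatItemAction": {"item": {renderer_name: {"id": chat_id}}}}]}}


actions = [
    _wrap("liveChatTextMessageRenderer", "a"),
    _wrap("liveChatPaidMessageRenderer", "b"),
    _wrap("liveChatPaidStickerRenderer", "c"),
]
paid = filter_paid(actions)
print(len(paid))
```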
||||
def get_offset(item):
|
||||
return int(item['replayChatItemAction']["videoOffsetTimeMsec"])
|
||||
|
||||
|
||||
def get_id(item):
|
||||
return list((list(item['replayChatItemAction']["actions"][0].values()
|
||||
)[0])['item'].values())[0].get('id')
|
||||
|
||||
|
||||
def get_type(item):
|
||||
return list((list(item['replayChatItemAction']["actions"][0].values()
|
||||
)[0])['item'].keys())[0]
|
||||
|
||||
|
||||
_REGEX_YTINIT = re.compile(
|
||||
"window\\[\"ytInitialData\"\\]\\s*=\\s*({.+?});\\s+")
|
||||
|
||||
|
||||
def extract(text):
|
||||
|
||||
match = re.findall(_REGEX_YTINIT, str(text))
|
||||
if match:
|
||||
return match[0]
|
||||
return None
|
||||
@@ -1,27 +0,0 @@
|
||||
from . import parser
|
||||
from . block import Block
|
||||
from typing import NamedTuple
|
||||
|
||||
class Patch(NamedTuple):
|
||||
"""
|
||||
Patch represents chunk of chat data
|
||||
which is fetched by asyncdl.fetch_patch._fetch().
|
||||
"""
|
||||
chats : list = []
|
||||
continuation : str = None
|
||||
seektime : float = None
|
||||
first : int = None
|
||||
last : int = None
|
||||
|
||||
def fill(block:Block, patch:Patch):
|
||||
if patch.last < block.end:
|
||||
set_patch(block, patch)
|
||||
return
|
||||
block.continuation = None
|
||||
|
||||
def set_patch(block:Block, patch:Patch):
|
||||
block.continuation = patch.continuation
|
||||
block.chat_data.extend(patch.chats)
|
||||
block.last = patch.seektime
|
||||
block.seektime = patch.seektime
|
||||
|
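`fill` appends a patch to its block while the patch still lies before `block.end`, and otherwise clears `block.continuation`, which is the condition the worker loop tests. The sketch below replays that hand-off with a fake fetch. The reduced `Patch` and `Block` types, the one-chat-per-second fetch, and the starting continuation value are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Patch(NamedTuple):
    chats: list
    continuation: Optional[str]
    seektime: float
    last: float


class Block:
    def __init__(self, end, continuation="start", seektime=0.0):
        self.end = end
        self.continuation = continuation
        self.seektime = seektime
        self.last = seektime
        self.chat_data = []


def fill(block, patch):
    if patch.last < block.end:
        # Still inside the block: merge the patch and keep going.
        block.continuation = patch.continuation
        block.chat_data.extend(patch.chats)
        block.last = patch.seektime
        block.seektime = patch.seektime
        return
    # Reached the block's end: signal the worker loop to stop.
    block.continuation = None


block = Block(end=3)
while block.continuation:
    # Fake fetch: one chat per second of video (hypothetical).
    patch = Patch([f"chat@{block.seektime}"], "next",
                  block.seektime, block.seektime)
    fill(block, patch)
    block.seektime += 1

print(block.chat_data)
```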
||||
@@ -1,72 +0,0 @@
|
||||
from . import asyncdl
|
||||
from . import parser
|
||||
from .. videoinfo import VideoInfo
|
||||
from ... import config
|
||||
from ... exceptions import InvalidVideoIdException
|
||||
logger = config.logger(__name__)
|
||||
headers=config.headers
|
||||
|
||||
class SuperChatMiner:
|
||||
def __init__(self, video_id, duration, div, callback):
|
||||
if not isinstance(div ,int) or div < 1:
|
||||
raise ValueError('div must be positive integer.')
|
||||
elif div > 10:
|
||||
div = 10
|
||||
if not isinstance(duration ,int) or duration < 1:
|
||||
raise ValueError('duration must be positive integer.')
|
||||
self.video_id = video_id
|
||||
self.duration = duration
|
||||
self.div = div
|
||||
self.callback = callback
|
||||
self.blocks = []
|
||||
|
||||
def _ready_blocks(self):
|
||||
blocks = asyncdl.ready_blocks(
|
||||
self.video_id, self.duration, self.div, self.callback)
|
||||
self.blocks = [block for block in blocks if block is not None]
|
||||
return self
|
||||
|
||||
def _set_block_end(self):
|
||||
for i in range(len(self.blocks)-1):
|
||||
self.blocks[i].end = self.blocks[i+1].first
|
||||
self.blocks[-1].end = self.duration
|
||||
self.blocks[-1].is_last =True
|
||||
return self
|
||||
|
||||
def _download_blocks(self):
|
||||
asyncdl.fetch_patch(self.callback, self.blocks, self.video_id)
|
||||
return self
|
||||
|
||||
def _combine(self):
|
||||
ret = []
|
||||
for block in self.blocks:
|
||||
ret.extend(block.chat_data)
|
||||
return ret
|
||||
|
||||
def extract(self):
|
||||
return (
|
||||
self._ready_blocks()
|
||||
._set_block_end()
|
||||
._download_blocks()
|
||||
._combine()
|
||||
)
|
||||
|
||||
def extract(video_id, div = 1, callback = None, processor = None):
|
||||
duration = 0
|
||||
try:
|
||||
duration = VideoInfo(video_id).get_duration()
|
||||
except InvalidVideoIdException:
|
||||
raise
|
||||
if duration == 0:
|
||||
print("video is live.")
|
||||
return []
|
||||
data = SuperChatMiner(video_id, duration, div, callback).extract()
|
||||
if processor is None:
|
||||
return data
|
||||
return processor.process(
|
||||
[{'video_id':None,'timeout':1,'chatdata' : (action
|
||||
for action in data)}]
|
||||
)
|
||||
|
||||
def cancel():
|
||||
asyncdl.cancel()
|
||||
@@ -1,45 +0,0 @@
|
||||
from . import parser
|
||||
from . block import Block
|
||||
from . patch import Patch, fill
|
||||
from ... paramgen import arcparam
|
||||
INTERVAL = 1
|
||||
class ExtractWorker:
|
||||
"""
|
||||
ExtractWorker associates a download session with a block.
|
||||
|
||||
When the worker finishes fetching, the block
|
||||
being fetched is splitted and assigned the free worker.
|
||||
|
||||
Parameter
|
||||
----------
|
||||
fetch : func :
|
||||
extract function of asyncdl
|
||||
|
||||
block : Block :
|
||||
Block object that includes chat_data
|
||||
|
||||
blocks : list :
|
||||
List of Block(s)
|
||||
|
||||
video_id : str :
|
||||
|
||||
parent_block : Block :
|
||||
the block from which current block is splitted
|
||||
"""
|
||||
__slots__ = ['block', 'fetch', 'blocks', 'video_id', 'parent_block']
|
||||
def __init__(self, fetch, block, blocks, video_id ):
|
||||
self.block:Block = block
|
||||
self.fetch = fetch
|
||||
self.blocks:list = blocks
|
||||
self.video_id:str = video_id
|
||||
self.parent_block:Block = None
|
||||
|
||||
async def run(self, session):
|
||||
while self.block.continuation:
|
||||
patch = await self.fetch(
|
||||
self.block.seektime, session)
|
||||
fill(self.block, patch)
|
||||
self.block.seektime += INTERVAL
|
||||
self.block.done = True
|
||||
|
||||
|
||||
@@ -1,13 +1,15 @@
|
||||
import httpx
|
||||
import json
|
||||
import re
|
||||
import requests
|
||||
import time
|
||||
from .. import config
|
||||
from ..exceptions import InvalidVideoIdException
|
||||
from ..exceptions import InvalidVideoIdException, PatternUnmatchError, UnknownConnectionError
|
||||
from ..util.extract_video_id import extract_video_id
|
||||
|
||||
headers = config.headers
|
||||
|
||||
pattern = re.compile(r"'PLAYER_CONFIG': ({.*}}})")
|
||||
headers = config.headers
|
||||
pattern = re.compile(r"['\"]PLAYER_CONFIG['\"]:\s*({.*})")
|
||||
pattern2 = re.compile(r"yt\.setConfig\((\{[\s\S]*?\})\);")
|
||||
|
||||
item_channel_id = [
|
||||
"videoDetails",
|
||||
@@ -29,6 +31,10 @@ item_response = [
|
||||
"embedded_player_response"
|
||||
]
|
||||
|
||||
item_response2 = [
|
||||
"PLAYER_VARS",
|
||||
"embedded_player_response"
|
||||
]
|
||||
item_author_image = [
|
||||
"videoDetails",
|
||||
"embeddedPlayerOverlayVideoDetailsRenderer",
|
||||
@@ -80,21 +86,61 @@ class VideoInfo:
|
||||
|
||||
def __init__(self, video_id):
|
||||
self.video_id = extract_video_id(video_id)
|
||||
text = self._get_page_text(self.video_id)
|
||||
self._parse(text)
|
||||
self.client = httpx.Client(http2=True)
|
||||
self.new_pattern_text = False
|
||||
err = None
|
||||
for _ in range(3):
|
||||
try:
|
||||
text = self._get_page_text(self.video_id)
|
||||
self._parse(text)
|
||||
break
|
||||
except (InvalidVideoIdException, UnknownConnectionError) as e:
|
||||
raise e
|
||||
except Exception as e:
|
||||
err = e
|
||||
time.sleep(2)
|
||||
pass
|
||||
else:
|
||||
raise err
|
||||
|
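The new `__init__` retries up to three times with a `for ... else` loop: `break` leaves the loop on success, selected exceptions are re-raised immediately, and the `else` branch runs only when every attempt failed. A generic sketch of that control flow follows. `call_with_retry`, `FatalError`, and `flaky` are hypothetical names, and the delay defaults to zero so the example runs instantly.

```python
import time


class FatalError(Exception):
    """Stand-in for errors that should not be retried (hypothetical)."""


def call_with_retry(func, attempts=3, delay=0.0):
    err = None
    for _ in range(attempts):
        try:
            result = func()
            break
        except FatalError:
            # Not retryable: propagate at once.
            raise
        except Exception as e:
            err = e
            time.sleep(delay)
    else:
        # Runs only when the loop finished without `break`.
        raise err
    return result


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds on the third call.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("temporary")
    return "ok"


print(call_with_retry(flaky), calls["n"])
```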
||||
def _get_page_text(self, video_id):
|
||||
url = f"https://www.youtube.com/embed/{video_id}"
|
||||
resp = requests.get(url, headers=headers)
|
||||
resp.raise_for_status()
|
||||
err = None
|
||||
for _ in range(3):
|
||||
try:
|
||||
resp = self.client.get(url, headers=headers)
|
||||
resp.raise_for_status()
|
||||
break
|
||||
except httpx.HTTPError as e:
|
||||
err = e
|
||||
time.sleep(3)
|
||||
else:
|
||||
raise UnknownConnectionError(str(err))
|
||||
|
||||
return resp.text
|
||||
|
||||
def _parse(self, text):
|
||||
result = re.search(pattern, text)
|
||||
res = json.loads(result.group(1)[:-1])
|
||||
response = self._get_item(res, item_response)
|
||||
if result is None:
|
||||
result = re.search(pattern2, text)
|
||||
if result is None:
|
||||
raise PatternUnmatchError(doc=text)
|
||||
else:
|
||||
self.new_pattern_text = True
|
||||
decoder = json.JSONDecoder()
|
||||
if self.new_pattern_text:
|
||||
res = decoder.raw_decode(result.group(1))[0]
|
||||
else:
|
||||
res = decoder.raw_decode(result.group(1)[:-1])[0]
|
||||
if self.new_pattern_text:
|
||||
response = self._get_item(res, item_response2)
|
||||
else:
|
||||
response = self._get_item(res, item_response)
|
||||
if response is None:
|
||||
self._check_video_is_private(res.get("args"))
|
||||
if self.new_pattern_text:
|
||||
self._check_video_is_private(res.get("PLAYER_VARS"))
|
||||
else:
|
||||
self._check_video_is_private(res.get("args"))
|
||||
self._renderer = self._get_item(json.loads(response), item_renderer)
|
||||
if self._renderer is None:
|
||||
raise InvalidVideoIdException(
|
||||
|
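The rewritten `_parse` switches from `json.loads` to `json.JSONDecoder().raw_decode`. `raw_decode` parses one JSON document from the start of a string and returns it together with the index where parsing stopped, so text captured after the object does not cause an error. A small sketch, using an invented input string:

```python
import json

# Invented input: a JSON object followed by non-JSON trailing text.
text = '{"PLAYER_VARS": {"video_id": "abc"}}); trailing();'

decoder = json.JSONDecoder()
obj, end = decoder.raw_decode(text)
print(obj["PLAYER_VARS"]["video_id"])
print(text[end:])

try:
    json.loads(text)
    strict_ok = True
except json.JSONDecodeError:
    # json.loads rejects the same string because of the extra data.
    strict_ok = False
print(strict_ok)
```

Note that `raw_decode` expects the document to begin at the given index; it does not skip leading whitespace.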
||||
@@ -1,18 +1,41 @@
|
||||
import requests
|
||||
import json
|
||||
import datetime
|
||||
import httpx
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
from .. import config
|
||||
|
||||
PATTERN = re.compile(r"(.*)\(([0-9]+)\)$")
|
||||
|
||||
|
||||
def extract(url):
|
||||
_session = requests.Session()
|
||||
_session = httpx.Client(http2=True)
|
||||
html = _session.get(url, headers=config.headers)
|
||||
with open(str(datetime.datetime.now().strftime('%Y-%m-%d %H-%M-%S')
|
||||
) + 'test.json', mode='w', encoding='utf-8') as f:
|
||||
json.dump(html.json(), f, ensure_ascii=False)
|
||||
|
||||
|
||||
def save(data, filename, extention):
|
||||
with open(filename + "_" + (datetime.datetime.now().strftime('%Y-%m-%d %H-%M-%S')) + extention,
|
||||
mode='w', encoding='utf-8') as f:
|
||||
def save(data, filename, extention) -> str:
|
||||
save_filename = filename + "_" + (datetime.datetime.now().strftime('%Y-%m-%d %H-%M-%S')) + extention
|
||||
with open(save_filename ,mode='w', encoding='utf-8') as f:
|
||||
f.writelines(data)
|
||||
return save_filename
|
||||
|
||||
|
||||
def checkpath(filepath):
|
||||
splitter = os.path.splitext(os.path.basename(filepath))
|
||||
body = splitter[0]
|
||||
extention = splitter[1]
|
||||
newpath = filepath
|
||||
counter = 1
|
||||
while os.path.exists(newpath):
|
||||
match = re.search(PATTERN, body)
|
||||
if match:
|
||||
counter = int(match[2]) + 1
|
||||
num_with_bracket = f'({str(counter)})'
|
||||
body = f'{match[1]}{num_with_bracket}'
|
||||
else:
|
||||
body = f'{body}({str(counter)})'
|
||||
newpath = os.path.join(os.path.dirname(filepath), body + extention)
|
||||
return newpath
|
||||
|
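`checkpath` returns the given path if it is free and otherwise appends or increments a `(n)` suffix until an unused name is found. The sketch below restates that logic and exercises it in a temporary directory. The file name `chat.html` is an arbitrary example, and the local variable is spelled `extension` here for readability.

```python
import os
import re
import tempfile

PATTERN = re.compile(r"(.*)\(([0-9]+)\)$")


def checkpath(filepath):
    body, extension = os.path.splitext(os.path.basename(filepath))
    newpath = filepath
    counter = 1
    while os.path.exists(newpath):
        match = re.search(PATTERN, body)
        if match:
            # Name already ends in "(n)": bump the number.
            counter = int(match[2]) + 1
            body = f"{match[1]}({counter})"
        else:
            body = f"{body}({counter})"
        newpath = os.path.join(os.path.dirname(filepath), body + extension)
    return newpath


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "chat.html")
    first = checkpath(target)      # nothing exists yet
    open(target, "w").close()
    second = checkpath(target)     # chat.html is taken
    open(second, "w").close()
    third = checkpath(target)      # chat(1).html is taken too
    names = [os.path.basename(p) for p in (first, second, third)]

print(names)
```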
||||
@@ -8,18 +8,21 @@ YT_VIDEO_ID_LENGTH = 11
|
||||
|
||||
def extract_video_id(url_or_id: str) -> str:
|
||||
ret = ''
|
||||
if '[' in url_or_id:
|
||||
url_or_id = url_or_id.replace('[', '').replace(']', '')
|
||||
|
||||
if type(url_or_id) != str:
|
||||
raise TypeError(f"{url_or_id}: URL or VideoID must be str, but {type(url_or_id)} is passed.")
|
||||
if len(url_or_id) == YT_VIDEO_ID_LENGTH:
|
||||
return url_or_id
|
||||
match = re.search(PATTERN, url_or_id)
|
||||
if match is None:
|
||||
raise InvalidVideoIdException(url_or_id)
|
||||
raise InvalidVideoIdException(f"Invalid video id: {url_or_id}")
|
||||
try:
|
||||
ret = match.group(4)
|
||||
except IndexError:
|
||||
raise InvalidVideoIdException(url_or_id)
|
||||
raise InvalidVideoIdException(f"Invalid video id: {url_or_id}")
|
||||
|
||||
if ret is None or len(ret) != YT_VIDEO_ID_LENGTH:
|
||||
raise InvalidVideoIdException(url_or_id)
|
||||
raise InvalidVideoIdException(f"Invalid video id: {url_or_id}")
|
||||
return ret
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
aiohttp
|
||||
protobuf
|
||||
httpx[http2]==0.16.1
|
||||
protobuf==3.14.0
|
||||
pytz
|
||||
requests
|
||||
urllib3
|
@@ -1,5 +1,2 @@
-aioresponses
-mock
-mocker
-pytest
-pytest-mock
+pytest-mock==3.3.1
+pytest-httpx==0.10.0
||||
|
@@ -1,5 +1,5 @@
 import json
-import requests
+import httpx
 import pytchat.config as config
 from pytchat.paramgen import arcparam
 from pytchat.parser.live import Parser
||||
@@ -18,14 +18,15 @@ def test_arcparam_1(mocker):
|
||||
def test_arcparam_2(mocker):
|
||||
param = arcparam.getparam("SsjCnHOk-Sk", seektime=100)
|
||||
url = f"https://www.youtube.com/live_chat_replay/get_live_chat_replay?continuation={param}&pbj=1"
|
||||
resp = requests.Session().get(url, headers=config.headers)
|
||||
resp = httpx.Client(http2=True).get(url, headers=config.headers)
|
||||
jsn = json.loads(resp.text)
|
||||
parser = Parser(is_replay=True)
|
||||
contents = parser.get_contents(jsn)
|
||||
_ , chatdata = parser.parse(contents)
|
||||
_, chatdata = parser.parse(contents)
|
||||
test_id = chatdata[0]["addChatItemAction"]["item"]["liveChatTextMessageRenderer"]["id"]
|
||||
assert test_id == "CjoKGkNMYXBzZTdudHVVQ0Zjc0IxZ0FkTnFnQjVREhxDSnlBNHV2bnR1VUNGV0dnd2dvZDd3NE5aZy0w"
|
||||
|
||||
|
||||
def test_arcparam_3(mocker):
|
||||
param = arcparam.getparam("01234567890")
|
||||
assert param == "op2w0wQmGhxDZzhLRFFvTE1ERXlNelExTmpjNE9UQWdBUT09SARgAXICCAE%3D"
|
||||
|
||||
@@ -1,41 +0,0 @@
|
||||
from pytchat.tool.mining import parser
|
||||
import pytchat.config as config
|
||||
import requests
|
||||
import json
|
||||
from pytchat.paramgen import arcparam_mining as arcparam
|
||||
|
||||
|
||||
def test_arcparam_e(mocker):
|
||||
try:
|
||||
arcparam.getparam("01234567890", -1)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
|
||||
def test_arcparam_0(mocker):
|
||||
param = arcparam.getparam("01234567890", 0)
|
||||
|
||||
assert param == "op2w0wQsGiBDZzhhRFFvTE1ERXlNelExTmpjNE9UQWdBUSUzRCUzREABYARyAggBeAE%3D"
|
||||
|
||||
|
||||
def test_arcparam_1(mocker):
|
||||
param = arcparam.getparam("01234567890", seektime=100000)
|
||||
print(param)
|
||||
assert param == "op2w0wQzGiBDZzhhRFFvTE1ERXlNelExTmpjNE9UQWdBUSUzRCUzREABWgUQgMLXL2AEcgIIAXgB"
|
||||
|
||||
|
||||
def test_arcparam_2(mocker):
|
||||
param = arcparam.getparam("PZz9NB0-Z64", 1)
|
||||
url = f"https://www.youtube.com/live_chat_replay?continuation={param}&playerOffsetMs=1000&pbj=1"
|
||||
resp = requests.Session().get(url, headers=config.headers)
|
||||
jsn = json.loads(resp.text)
|
||||
_, chatdata = parser.parse(jsn[1])
|
||||
test_id = chatdata[0]["addChatItemAction"]["item"]["liveChatPaidMessageRenderer"]["id"]
|
||||
print(test_id)
|
||||
assert test_id == "ChwKGkNKSGE0YnFJeWVBQ0ZWcUF3Z0VkdGIwRm9R"
|
||||
|
||||
|
||||
def test_arcparam_3(mocker):
|
||||
param = arcparam.getparam("01234567890")
|
||||
assert param == "op2w0wQsGiBDZzhhRFFvTE1ERXlNelExTmpjNE9UQWdBUSUzRCUzREABYARyAggBeAE%3D"
|
||||
@@ -1,8 +1,17 @@
|
||||
import json
|
||||
from datetime import datetime
|
||||
from pytchat.parser.live import Parser
|
||||
from pytchat.processors.default.processor import DefaultProcessor
|
||||
|
||||
|
||||
TEST_TIMETSTAMP = 1570678496000000
|
||||
|
||||
|
||||
def get_local_datetime(timestamp):
|
||||
dt = datetime.fromtimestamp(timestamp / 1000000)
|
||||
return dt.strftime('%Y-%m-%d %H:%M:%S')
|
||||
|
||||
|
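The tests in this file replace the hard-coded string "2019-10-10 12:34:56" with `get_local_datetime(TEST_TIMETSTAMP)`, so the expected value follows the time zone of the machine running the tests. The sketch below shows why: a naive `datetime.fromtimestamp` converts to local time, while passing `tz=timezone.utc` gives a fixed result. `get_utc_datetime` is a hypothetical helper added only for comparison, and the constant is spelled `TEST_TIMESTAMP` here.

```python
from datetime import datetime, timezone

TEST_TIMESTAMP = 1570678496000000  # microseconds since the epoch


def get_local_datetime(timestamp):
    # Naive fromtimestamp() converts to the machine's local time zone.
    dt = datetime.fromtimestamp(timestamp / 1000000)
    return dt.strftime('%Y-%m-%d %H:%M:%S')


def get_utc_datetime(timestamp):
    # Same instant, rendered in UTC regardless of the local zone.
    dt = datetime.fromtimestamp(timestamp / 1000000, tz=timezone.utc)
    return dt.strftime('%Y-%m-%d %H:%M:%S')


print(get_utc_datetime(TEST_TIMESTAMP))    # 2019-10-10 03:34:56
print(get_local_datetime(TEST_TIMESTAMP))  # depends on the local time zone
```

The old literal, 12:34:56, is this instant as seen on a clock set to UTC+9.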
||||
def test_textmessage(mocker):
|
||||
'''text message'''
|
||||
processor = DefaultProcessor()
|
||||
@@ -17,11 +26,10 @@ def test_textmessage(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
assert ret.chattype == "textMessage"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == "dummy_message"
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.author.name == "author_name"
|
||||
assert ret.author.channelId == "author_channel_id"
|
||||
assert ret.author.channelUrl == "http://www.youtube.com/channel/author_channel_id"
|
||||
@@ -47,13 +55,12 @@ def test_textmessage_replay_member(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
assert ret.chattype == "textMessage"
|
||||
assert ret.type == "textMessage"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == "dummy_message"
|
||||
assert ret.messageEx == ["dummy_message"]
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.elapsedTime == "1:23:45"
|
||||
assert ret.author.name == "author_name"
|
||||
assert ret.author.channelId == "author_channel_id"
|
||||
@@ -80,14 +87,12 @@ def test_superchat(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
print(json.dumps(chatdata, ensure_ascii=False))
|
||||
assert ret.chattype == "superChat"
|
||||
assert ret.type == "superChat"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == "dummy_message"
|
||||
assert ret.messageEx == ["dummy_message"]
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.elapsedTime == ""
|
||||
assert ret.amountValue == 800
|
||||
assert ret.amountString == "¥800"
|
||||
@@ -124,14 +129,12 @@ def test_supersticker(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
print(json.dumps(chatdata, ensure_ascii=False))
|
||||
assert ret.chattype == "superSticker"
|
||||
assert ret.type == "superSticker"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == ""
|
||||
assert ret.messageEx == []
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.elapsedTime == ""
|
||||
assert ret.amountValue == 200
|
||||
assert ret.amountString == "¥200"
|
||||
@@ -167,14 +170,12 @@ def test_sponsor(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
print(json.dumps(chatdata, ensure_ascii=False))
|
||||
assert ret.chattype == "newSponsor"
|
||||
assert ret.type == "newSponsor"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == "新規メンバー"
|
||||
assert ret.messageEx == ["新規メンバー"]
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.elapsedTime == ""
|
||||
assert ret.bgColor == 0
|
||||
assert ret.author.name == "author_name"
|
||||
@@ -202,14 +203,12 @@ def test_sponsor_legacy(mocker):
|
||||
}
|
||||
|
||||
ret = processor.process([data]).items[0]
|
||||
print(json.dumps(chatdata, ensure_ascii=False))
|
||||
assert ret.chattype == "newSponsor"
|
||||
assert ret.type == "newSponsor"
|
||||
assert ret.id == "dummy_id"
|
||||
assert ret.message == "新規メンバー / ようこそ、author_name!"
|
||||
assert ret.messageEx == ["新規メンバー / ようこそ、author_name!"]
|
||||
assert ret.timestamp == 1570678496000
|
||||
assert ret.datetime == "2019-10-10 12:34:56"
|
||||
assert ret.datetime == get_local_datetime(TEST_TIMETSTAMP)
|
||||
assert ret.elapsedTime == ""
|
||||
assert ret.bgColor == 0
|
||||
assert ret.author.name == "author_name"
|
||||
|
||||
@@ -1,77 +0,0 @@
|
||||
import aiohttp
|
||||
import asyncio
|
||||
import json
|
||||
from pytchat.tool.extract import parser
|
||||
import sys
|
||||
import time
|
||||
from aioresponses import aioresponses
|
||||
from concurrent.futures import CancelledError
|
||||
from pytchat.tool.extract import asyncdl
|
||||
|
||||
def _open_file(path):
|
||||
with open(path,mode ='r',encoding = 'utf-8') as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
def test_asyncdl_split():
|
||||
|
||||
ret = asyncdl._split(0,1000,1)
|
||||
assert ret == [0]
|
||||
|
||||
ret = asyncdl._split(1000,1000,10)
|
||||
assert ret == [1000]
|
||||
|
||||
ret = asyncdl._split(0,1000,5)
|
||||
assert ret == [0,200,400,600,800]
|
||||
|
||||
ret = asyncdl._split(10.5, 700.3, 5)
|
||||
assert ret == [10, 148, 286, 424, 562]
|
||||
|
||||
|
||||
ret = asyncdl._split(0,500,5)
|
||||
assert ret == [0,125,250,375]
|
||||
|
||||
ret = asyncdl._split(0,500,500)
|
||||
assert ret == [0,125,250,375]
|
||||
|
||||
ret = asyncdl._split(-1,1000,5)
|
||||
assert ret == [-1, 199, 399, 599, 799]
|
||||
|
||||
"""invalid argument order"""
|
||||
try:
|
||||
ret = asyncdl._split(500,0,5)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
"""invalid count"""
|
||||
try:
|
||||
ret = asyncdl._split(0,500,-1)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
try:
|
||||
ret = asyncdl._split(0,500,0)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
"""invalid argument type"""
|
||||
try:
|
||||
ret = asyncdl._split(0,5000,5.2)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
try:
|
||||
ret = asyncdl._split(0,5000,"test")
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
|
||||
try:
|
||||
ret = asyncdl._split([0,1],5000,5)
|
||||
assert False
|
||||
except ValueError:
|
||||
assert True
|
||||
@@ -1,60 +1,66 @@
|
||||
import aiohttp
|
||||
import asyncio
|
||||
import json
|
||||
import os, sys
|
||||
import time
|
||||
from pytchat.tool.extract import duplcheck
|
||||
from pytchat.tool.extract import parser
|
||||
from pytchat.tool.extract.block import Block
|
||||
from pytchat.tool.extract.duplcheck import _dump
|
||||
def _open_file(path):
|
||||
with open(path,mode ='r',encoding = 'utf-8') as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
def _open_file(path):
|
||||
with open(path, mode='r', encoding='utf-8') as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
def test_overlap():
|
||||
"""
|
||||
test overlap data
|
||||
test overlap data
|
||||
operation : [0] [2] [3] [4] -> last :align to end
|
||||
[1] , [5] -> no change
|
||||
|
||||
|
||||
"""
|
||||
|
||||
def load_chatdata(filename):
|
||||
return parser.parse(
|
||||
json.loads(_open_file("tests/testdata/extract_duplcheck/overlap/"+filename))
|
||||
json.loads(_open_file(
|
||||
"tests/testdata/extract_duplcheck/overlap/" + filename))
|
||||
)[1]
|
||||
|
||||
blocks = (
|
||||
Block(first = 0, last= 12771, end= 9890,chat_data = load_chatdata("dp0-0.json")),
|
||||
Block(first = 9890, last= 15800, end= 20244,chat_data = load_chatdata("dp0-1.json")),
|
||||
Block(first = 20244,last= 45146, end= 32476,chat_data = load_chatdata("dp0-2.json")),
|
||||
Block(first = 32476,last= 50520, end= 41380,chat_data = load_chatdata("dp0-3.json")),
|
||||
Block(first = 41380,last= 62875, end= 52568,chat_data = load_chatdata("dp0-4.json")),
|
||||
Block(first = 52568,last= 62875, end= 54000,chat_data = load_chatdata("dp0-5.json"),is_last=True)
|
||||
Block(first=0, last=12771, end=9890,
|
||||
chat_data=load_chatdata("dp0-0.json")),
|
||||
Block(first=9890, last=15800, end=20244,
|
||||
chat_data=load_chatdata("dp0-1.json")),
|
||||
Block(first=20244, last=45146, end=32476,
|
||||
chat_data=load_chatdata("dp0-2.json")),
|
||||
Block(first=32476, last=50520, end=41380,
|
||||
chat_data=load_chatdata("dp0-3.json")),
|
||||
Block(first=41380, last=62875, end=52568,
|
||||
chat_data=load_chatdata("dp0-4.json")),
|
||||
Block(first=52568, last=62875, end=54000,
|
||||
chat_data=load_chatdata("dp0-5.json"), is_last=True)
|
||||
)
|
||||
result = duplcheck.remove_overlap(blocks)
|
||||
#dp0-0.json has item offset time is 9890 (equals block[0].end = block[1].first),
|
||||
#but must be aligne to the most close and smaller value:9779.
|
||||
# dp0-0.json has an item whose offset time is 9890 (equal to block[0].end = block[1].first),
|
||||
# but it must be aligned to the closest smaller value: 9779.
|
||||
assert result[0].last == 9779
|
||||
|
||||
|
||||
assert result[1].last == 15800
|
||||
|
||||
|
||||
assert result[2].last == 32196
|
||||
|
||||
|
||||
assert result[3].last == 41116
|
||||
|
||||
|
||||
assert result[4].last == 52384
|
||||
|
||||
#the last block must be always added to result.
|
||||
|
||||
# the last block must always be added to the result.
|
||||
assert result[5].last == 62875
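The comment in `test_overlap` says a block's `last` must be aligned to the closest chat offset that is smaller than the next block's `first` (9779 rather than 9890). The following is only a minimal sketch of that alignment rule, not pytchat's actual `remove_overlap` code. The helper name `align_last` and the sample offsets are hypothetical; only the values 9779 and 9890 come from the test.

```python
from bisect import bisect_left


def align_last(offsets, boundary):
    """Return the largest offset strictly smaller than `boundary`.

    `offsets` must be sorted in ascending order. Returns None when no
    offset is smaller than `boundary`.
    """
    idx = bisect_left(offsets, boundary)  # first position with offset >= boundary
    return offsets[idx - 1] if idx > 0 else None


# Hypothetical chat offsets; 9890 is the next block's `first`.
sample_offsets = [0, 5120, 9779, 9890, 12771]
aligned = align_last(sample_offsets, 9890)
print(aligned)
```

Using `bisect_left` rather than `bisect_right` is what excludes an item sitting exactly on the boundary, which matches the test's expectation that 9890 itself is not kept.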
|
||||
|
||||
|
||||
|
||||
def test_duplicate_head():
|
||||
|
||||
def load_chatdata(filename):
|
||||
return parser.parse(
|
||||
json.loads(_open_file("tests/testdata/extract_duplcheck/head/"+filename))
|
||||
json.loads(_open_file(
|
||||
"tests/testdata/extract_duplcheck/head/" + filename))
|
||||
)[1]
|
||||
|
||||
"""
|
||||
@@ -69,25 +75,26 @@ def test_duplicate_head():
|
||||
result : [2] , [4] , [5]
|
||||
"""
|
||||
|
||||
#chat data offsets are ignored.
|
||||
# chat data offsets are ignored.
|
||||
blocks = (
|
||||
Block(first = 0, last = 2500, chat_data = load_chatdata("dp0-0.json")),
|
||||
Block(first = 0, last =38771, chat_data = load_chatdata("dp0-1.json")),
|
||||
Block(first = 0, last =45146, chat_data = load_chatdata("dp0-2.json")),
|
||||
Block(first = 20244, last =60520, chat_data = load_chatdata("dp0-3.json")),
|
||||
Block(first = 20244, last =62875, chat_data = load_chatdata("dp0-4.json")),
|
||||
Block(first = 52568, last =62875, chat_data = load_chatdata("dp0-5.json"))
|
||||
Block(first=0, last=2500, chat_data=load_chatdata("dp0-0.json")),
|
||||
Block(first=0, last=38771, chat_data=load_chatdata("dp0-1.json")),
|
||||
Block(first=0, last=45146, chat_data=load_chatdata("dp0-2.json")),
|
||||
Block(first=20244, last=60520, chat_data=load_chatdata("dp0-3.json")),
|
||||
Block(first=20244, last=62875, chat_data=load_chatdata("dp0-4.json")),
|
||||
Block(first=52568, last=62875, chat_data=load_chatdata("dp0-5.json"))
|
||||
)
|
||||
_dump(blocks)
|
||||
result = duplcheck.remove_duplicate_head(blocks)
|
||||
|
||||
|
||||
assert len(result) == 3
|
||||
assert result[0].first == blocks[2].first
|
||||
assert result[0].last == blocks[2].last
|
||||
assert result[0].last == blocks[2].last
|
||||
assert result[1].first == blocks[4].first
|
||||
assert result[1].last == blocks[4].last
|
||||
assert result[1].last == blocks[4].last
|
||||
assert result[2].first == blocks[5].first
|
||||
assert result[2].last == blocks[5].last
|
||||
assert result[2].last == blocks[5].last
|
||||
|
||||
|
||||
def test_duplicate_tail():
|
||||
"""
|
||||
@@ -103,26 +110,25 @@ def test_duplicate_tail():
|
||||
"""
|
||||
def load_chatdata(filename):
|
||||
return parser.parse(
|
||||
json.loads(_open_file("tests/testdata/extract_duplcheck/head/"+filename))
|
||||
json.loads(_open_file(
|
||||
"tests/testdata/extract_duplcheck/head/" + filename))
|
||||
)[1]
|
||||
#chat data offsets are ignored.
|
||||
# chat data offsets are ignored.
|
||||
blocks = (
|
||||
Block(first = 0,last = 2500, chat_data=load_chatdata("dp0-0.json")),
|
||||
Block(first = 1500,last = 2500, chat_data=load_chatdata("dp0-1.json")),
|
||||
Block(first = 10000,last = 45146, chat_data=load_chatdata("dp0-2.json")),
|
||||
Block(first = 20244,last = 45146, chat_data=load_chatdata("dp0-3.json")),
|
||||
Block(first = 20244,last = 62875, chat_data=load_chatdata("dp0-4.json")),
|
||||
Block(first = 52568,last = 62875, chat_data=load_chatdata("dp0-5.json"))
|
||||
Block(first=0, last=2500, chat_data=load_chatdata("dp0-0.json")),
|
||||
Block(first=1500, last=2500, chat_data=load_chatdata("dp0-1.json")),
|
||||
Block(first=10000, last=45146, chat_data=load_chatdata("dp0-2.json")),
|
||||
Block(first=20244, last=45146, chat_data=load_chatdata("dp0-3.json")),
|
||||
Block(first=20244, last=62875, chat_data=load_chatdata("dp0-4.json")),
|
||||
Block(first=52568, last=62875, chat_data=load_chatdata("dp0-5.json"))
|
||||
)
|
||||
|
||||
result = duplcheck.remove_duplicate_tail(blocks)
|
||||
_dump(result)
|
||||
assert len(result) == 3
|
||||
assert result[0].first == blocks[0].first
|
||||
assert result[0].last == blocks[0].last
|
||||
assert result[0].last == blocks[0].last
|
||||
assert result[1].first == blocks[2].first
|
||||
assert result[1].last == blocks[2].last
|
||||
assert result[1].last == blocks[2].last
|
||||
assert result[2].first == blocks[4].first
|
||||
assert result[2].last == blocks[4].last
|
||||
|
||||
|
||||
assert result[2].last == blocks[4].last
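The assertions in `test_duplicate_head` and `test_duplicate_tail` imply a selection rule: among blocks sharing the same `first`, the one reaching furthest survives, and among blocks sharing the same `last`, the one starting earliest survives. The sketch below reproduces that rule under those assumptions. `Span`, `dedupe_head` and `dedupe_tail` are hypothetical stand-ins, and the real `duplcheck` functions may be implemented differently.

```python
from collections import namedtuple

# Hypothetical stand-in for pytchat's Block, carrying only the two fields used here.
Span = namedtuple("Span", "first last")


def dedupe_head(spans):
    """Among spans sharing `first`, keep the one with the greatest `last`."""
    best = {}
    for s in spans:
        if s.first not in best or s.last > best[s.first].last:
            best[s.first] = s
    return sorted(best.values(), key=lambda s: s.first)


def dedupe_tail(spans):
    """Among spans sharing `last`, keep the one with the smallest `first`."""
    best = {}
    for s in spans:
        if s.last not in best or s.first < best[s.last].first:
            best[s.last] = s
    return sorted(best.values(), key=lambda s: s.first)


# The (first, last) pairs used by the two tests.
head = [Span(0, 2500), Span(0, 38771), Span(0, 45146),
        Span(20244, 60520), Span(20244, 62875), Span(52568, 62875)]
tail = [Span(0, 2500), Span(1500, 2500), Span(10000, 45146),
        Span(20244, 45146), Span(20244, 62875), Span(52568, 62875)]

head_result = dedupe_head(head)
tail_result = dedupe_tail(tail)
```

With the test data, `head_result` corresponds to the blocks at indices 2, 4 and 5, and `tail_result` to indices 0, 2 and 4, matching the assertions in the two tests.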
|
||||
|
||||
@@ -1,23 +1,19 @@
|
||||
import aiohttp
|
||||
import asyncio
|
||||
import json
|
||||
import os, sys
|
||||
import time
|
||||
from aioresponses import aioresponses
|
||||
from pytchat.tool.extract import duplcheck
|
||||
|
||||
from pytchat.tool.extract import parser
|
||||
from pytchat.tool.extract.block import Block
|
||||
from pytchat.tool.extract.patch import Patch, fill, split, set_patch
|
||||
from pytchat.tool.extract.duplcheck import _dump
|
||||
from pytchat.tool.extract.patch import Patch, split
|
||||
|
||||
|
||||
def _open_file(path):
|
||||
with open(path,mode ='r',encoding = 'utf-8') as f:
|
||||
with open(path, mode='r', encoding='utf-8') as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
def load_chatdata(filename):
|
||||
return parser.parse(
|
||||
json.loads(_open_file("tests/testdata/fetch_patch/"+filename))
|
||||
)[1]
|
||||
return parser.parse(
|
||||
json.loads(_open_file("tests/testdata/fetch_patch/" + filename))
|
||||
)[1]
|
||||
|
||||
|
||||
def test_split_0():
|
||||
@@ -61,20 +57,23 @@ def test_split_0():
|
||||
@fetched patch
|
||||
|-- patch --|
|
||||
"""
|
||||
parent = Block(first=0, last=4000, end=60000, continuation='parent', during_split=True)
|
||||
child = Block(first=0, last=0, end=60000, continuation='mean', during_split=True)
|
||||
parent = Block(first=0, last=4000, end=60000,
|
||||
continuation='parent', during_split=True)
|
||||
child = Block(first=0, last=0, end=60000,
|
||||
continuation='mean', during_split=True)
|
||||
patch = Patch(chats=load_chatdata('pt0-5.json'),
|
||||
first=32500, last=34000, continuation='patch')
|
||||
|
||||
split(parent,child,patch)
|
||||
first=32500, last=34000, continuation='patch')
|
||||
|
||||
split(parent, child, patch)
|
||||
|
||||
assert child.continuation == 'patch'
|
||||
assert parent.last < child.first
|
||||
assert parent.end == child.first
|
||||
assert child.first < child.last
|
||||
assert child.last < child.end
|
||||
assert parent.during_split == False
|
||||
assert child.during_split == False
|
||||
assert parent.during_split is False
|
||||
assert child.during_split is False
|
||||
|
||||
|
||||
def test_split_1():
|
||||
"""patch.first <= parent_block.last
|
||||
@@ -119,14 +118,15 @@ def test_split_1():
|
||||
child = Block(first=0, last=0, end=60000, continuation='mean', during_split=True)
|
||||
patch = Patch(chats=load_chatdata('pt0-5.json'),
|
||||
first=32500, last=34000, continuation='patch')
|
||||
|
||||
split(parent,child,patch)
|
||||
|
||||
assert parent.last == 33000 #no change
|
||||
assert parent.end == 60000 #no change
|
||||
split(parent, child, patch)
|
||||
|
||||
assert parent.last == 33000 # no change
|
||||
assert parent.end == 60000 # no change
|
||||
assert child.continuation is None
|
||||
assert parent.during_split == False
|
||||
assert child.during_split == True #exclude during_split sequence
|
||||
assert parent.during_split is False
|
||||
assert child.during_split is True # exclude during_split sequence
|
||||
|
||||
|
||||
def test_split_2():
|
||||
"""child_block.end < patch.last:
|
||||
@@ -174,7 +174,7 @@ def test_split_2():
|
||||
patch = Patch(chats=load_chatdata('pt0-5.json'),
|
||||
first=32500, last=34000, continuation='patch')
|
||||
|
||||
split(parent,child,patch)
|
||||
split(parent, child, patch)
|
||||
|
||||
assert child.continuation is None
|
||||
assert parent.last < child.first
|
||||
@@ -182,8 +182,9 @@ def test_split_2():
|
||||
assert child.first < child.last
|
||||
assert child.last < child.end
|
||||
assert child.continuation is None
|
||||
assert parent.during_split == False
|
||||
assert child.during_split == False
|
||||
assert parent.during_split is False
|
||||
assert child.during_split is False
|
||||
|
||||
|
||||
def test_split_none():
|
||||
"""patch.last <= parent_block.last
|
||||
@@ -193,7 +194,7 @@ def test_split_none():
|
||||
and parent.block.last exceeds patch.first.
|
||||
|
||||
In this case, fetched patch is all discarded,
|
||||
and worker searches other processing block again.
|
||||
and worker searches other processing block again.
|
||||
|
||||
~~~~~~ before ~~~~~~
|
||||
|
||||
@@ -229,10 +230,10 @@ def test_split_none():
|
||||
patch = Patch(chats=load_chatdata('pt0-5.json'),
|
||||
first=32500, last=34000, continuation='patch')
|
||||
|
||||
split(parent,child,patch)
|
||||
split(parent, child, patch)
|
||||
|
||||
assert parent.last == 40000 #no change
|
||||
assert parent.end == 60000 #no change
|
||||
assert parent.last == 40000 # no change
|
||||
assert parent.end == 60000 # no change
|
||||
assert child.continuation is None
|
||||
assert parent.during_split == False
|
||||
assert child.during_split == True #exclude during_split sequence
|
||||
assert parent.during_split is False
|
||||
assert child.during_split is True # exclude during_split sequence
|
||||
|
||||
@@ -1,5 +1,8 @@
|
||||
import asyncio
|
||||
import json
|
||||
from aioresponses import aioresponses
|
||||
from pytest_httpx import HTTPXMock
|
||||
from concurrent.futures import CancelledError
|
||||
from pytchat.core_multithread.livechat import LiveChat
|
||||
from pytchat.core_async.livechat import LiveChatAsync
|
||||
from pytchat.exceptions import ResponseContextError
|
||||
|
||||
@@ -9,34 +12,37 @@ def _open_file(path):
|
||||
return f.read()
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_Async(*mock):
|
||||
vid = '__test_id__'
|
||||
_text = _open_file('tests/testdata/paramgen_firstread.json')
|
||||
_text = json.loads(_text)
|
||||
mock[0].get(
|
||||
f"https://www.youtube.com/live_chat?v={vid}&is_popout=1", status=200, body=_text)
|
||||
def add_response_file(httpx_mock: HTTPXMock, jsonfile_path: str):
|
||||
testdata = json.loads(_open_file(jsonfile_path))
|
||||
httpx_mock.add_response(json=testdata)
|
||||
|
||||
|
||||
def test_async(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/paramgen_firstread.json')
|
||||
|
||||
async def test_loop():
|
||||
try:
|
||||
chat = LiveChatAsync(video_id='__test_id__')
|
||||
_ = await chat.get()
|
||||
assert chat.is_alive()
|
||||
chat.terminate()
|
||||
assert not chat.is_alive()
|
||||
except ResponseContextError:
|
||||
assert False
|
||||
loop = asyncio.get_event_loop()
|
||||
try:
|
||||
chat = LiveChatAsync(video_id='__test_id__')
|
||||
loop.run_until_complete(test_loop())
|
||||
except CancelledError:
|
||||
assert True
|
||||
|
||||
|
||||
def test_multithread(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/paramgen_firstread.json')
|
||||
try:
|
||||
chat = LiveChat(video_id='__test_id__')
|
||||
_ = chat.get()
|
||||
assert chat.is_alive()
|
||||
chat.terminate()
|
||||
assert not chat.is_alive()
|
||||
except ResponseContextError:
|
||||
assert not chat.is_alive()
|
||||
|
||||
|
||||
def test_MultiThread(mocker):
|
||||
_text = _open_file('tests/testdata/paramgen_firstread.json')
|
||||
_text = json.loads(_text)
|
||||
responseMock = mocker.Mock()
|
||||
responseMock.status_code = 200
|
||||
responseMock.text = _text
|
||||
mocker.patch('requests.Session.get').return_value = responseMock
|
||||
try:
|
||||
chat = LiveChatAsync(video_id='__test_id__')
|
||||
assert chat.is_alive()
|
||||
chat.terminate()
|
||||
assert not chat.is_alive()
|
||||
except ResponseContextError:
|
||||
chat.terminate()
|
||||
assert not chat.is_alive()
|
||||
assert False
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import asyncio
|
||||
import re
|
||||
from aioresponses import aioresponses
|
||||
import json
|
||||
from pytest_httpx import HTTPXMock
|
||||
from concurrent.futures import CancelledError
|
||||
from pytchat.core_multithread.livechat import LiveChat
|
||||
from pytchat.core_async.livechat import LiveChatAsync
|
||||
@@ -12,18 +12,18 @@ def _open_file(path):
|
||||
return f.read()
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_async_live_stream(*mock):
|
||||
def add_response_file(httpx_mock: HTTPXMock, jsonfile_path: str):
|
||||
testdata = json.loads(_open_file(jsonfile_path))
|
||||
httpx_mock.add_response(json=testdata)
|
||||
|
||||
async def test_loop(*mock):
|
||||
pattern = re.compile(
|
||||
r'^https://www.youtube.com/live_chat/get_live_chat\?continuation=.*$')
|
||||
_text = _open_file('tests/testdata/test_stream.json')
|
||||
mock[0].get(pattern, status=200, body=_text)
|
||||
|
||||
def test_async_live_stream(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/test_stream.json')
|
||||
|
||||
async def test_loop():
|
||||
chat = LiveChatAsync(video_id='__test_id__', processor=DummyProcessor())
|
||||
chats = await chat.get()
|
||||
rawdata = chats[0]["chatdata"]
|
||||
# assert fetching livechat data
|
||||
assert list(rawdata[0]["addChatItemAction"]["item"].keys())[
|
||||
0] == "liveChatTextMessageRenderer"
|
||||
assert list(rawdata[1]["addChatItemAction"]["item"].keys())[
|
||||
@@ -41,25 +41,16 @@ def test_async_live_stream(*mock):
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
try:
|
||||
loop.run_until_complete(test_loop(*mock))
|
||||
loop.run_until_complete(test_loop())
|
||||
except CancelledError:
|
||||
assert True
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_async_replay_stream(*mock):
|
||||
|
||||
async def test_loop(*mock):
|
||||
pattern_live = re.compile(
|
||||
r'^https://www.youtube.com/live_chat/get_live_chat\?continuation=.*$')
|
||||
pattern_replay = re.compile(
|
||||
r'^https://www.youtube.com/live_chat_replay/get_live_chat_replay\?continuation=.*$')
|
||||
# empty livechat -> switch to fetch replaychat
|
||||
_text_live = _open_file('tests/testdata/finished_live.json')
|
||||
_text_replay = _open_file('tests/testdata/chatreplay.json')
|
||||
mock[0].get(pattern_live, status=200, body=_text_live)
|
||||
mock[0].get(pattern_replay, status=200, body=_text_replay)
|
||||
def test_async_replay_stream(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/finished_live.json')
|
||||
add_response_file(httpx_mock, 'tests/testdata/chatreplay.json')
|
||||
|
||||
async def test_loop():
|
||||
chat = LiveChatAsync(video_id='__test_id__', processor=DummyProcessor())
|
||||
chats = await chat.get()
|
||||
rawdata = chats[0]["chatdata"]
|
||||
@@ -71,27 +62,16 @@ def test_async_replay_stream(*mock):
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
try:
|
||||
loop.run_until_complete(test_loop(*mock))
|
||||
loop.run_until_complete(test_loop())
|
||||
except CancelledError:
|
||||
assert True
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_async_force_replay(*mock):
|
||||
def test_async_force_replay(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/test_stream.json')
|
||||
add_response_file(httpx_mock, 'tests/testdata/chatreplay.json')
|
||||
|
||||
async def test_loop(*mock):
|
||||
pattern_live = re.compile(
|
||||
r'^https://www.youtube.com/live_chat/get_live_chat\?continuation=.*$')
|
||||
pattern_replay = re.compile(
|
||||
r'^https://www.youtube.com/live_chat_replay/get_live_chat_replay\?continuation=.*$')
|
||||
# valid live data, but force_replay = True
|
||||
_text_live = _open_file('tests/testdata/test_stream.json')
|
||||
# valid replay data
|
||||
_text_replay = _open_file('tests/testdata/chatreplay.json')
|
||||
|
||||
mock[0].get(pattern_live, status=200, body=_text_live)
|
||||
mock[0].get(pattern_replay, status=200, body=_text_replay)
|
||||
# force replay
|
||||
async def test_loop():
|
||||
chat = LiveChatAsync(
|
||||
video_id='__test_id__', processor=DummyProcessor(), force_replay=True)
|
||||
chats = await chat.get()
|
||||
@@ -105,20 +85,13 @@ def test_async_force_replay(*mock):
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
try:
|
||||
loop.run_until_complete(test_loop(*mock))
|
||||
loop.run_until_complete(test_loop())
|
||||
except CancelledError:
|
||||
assert True
|
||||
|
||||
|
||||
def test_multithread_live_stream(mocker):
|
||||
|
||||
_text = _open_file('tests/testdata/test_stream.json')
|
||||
responseMock = mocker.Mock()
|
||||
responseMock.status_code = 200
|
||||
responseMock.text = _text
|
||||
mocker.patch(
|
||||
'requests.Session.get').return_value.__enter__.return_value = responseMock
|
||||
|
||||
def test_multithread_live_stream(httpx_mock: HTTPXMock):
|
||||
add_response_file(httpx_mock, 'tests/testdata/test_stream.json')
|
||||
chat = LiveChat(video_id='__test_id__', processor=DummyProcessor())
|
||||
chats = chat.get()
|
||||
rawdata = chats[0]["chatdata"]
|
||||
|
||||
@@ -1,21 +1,18 @@
|
||||
from pytchat.parser.live import Parser
|
||||
import json
|
||||
from aioresponses import aioresponses
|
||||
from pytchat.exceptions import NoContents
|
||||
|
||||
|
||||
parser = Parser(is_replay=False)
|
||||
|
||||
|
||||
def _open_file(path):
|
||||
with open(path, mode='r', encoding='utf-8') as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
parser = Parser(is_replay=False)
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_finishedlive(*mock):
|
||||
'''Check that a video whose live stream has ended is handled correctly.'''
|
||||
|
||||
_text = _open_file('tests/testdata/finished_live.json')
|
||||
_text = json.loads(_text)
|
||||
|
||||
@@ -26,10 +23,8 @@ def test_finishedlive(*mock):
|
||||
assert True
|
||||
|
||||
|
||||
@aioresponses()
|
||||
def test_parsejson(*mock):
|
||||
'''Check that the JSON can be parsed correctly.'''
|
||||
|
||||
_text = _open_file('tests/testdata/paramgen_firstread.json')
|
||||
_text = json.loads(_text)
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
from json.decoder import JSONDecodeError
|
||||
from pytchat.tool.videoinfo import VideoInfo
|
||||
from pytchat.exceptions import InvalidVideoIdException
|
||||
|
||||
@@ -12,13 +13,13 @@ def _set_test_data(filepath, mocker):
|
||||
response_mock = mocker.Mock()
|
||||
response_mock.status_code = 200
|
||||
response_mock.text = _text
|
||||
mocker.patch('requests.get').return_value = response_mock
|
||||
mocker.patch('httpx.Client.get').return_value = response_mock
|
||||
|
||||
|
||||
def test_archived_page(mocker):
|
||||
_set_test_data('tests/testdata/videoinfo/archived_page.txt', mocker)
|
||||
info = VideoInfo('__test_id__')
|
||||
actual_thumbnail_url = 'https://i.ytimg.com/vi/fzI9FNjXQ0o/hqdefault.jpg'
|
||||
actual_thumbnail_url = 'https://i.ytimg.com/vi/fzI9FNjXQ0o/hqdefault.jpg'
|
||||
assert info.video_id == '__test_id__'
|
||||
assert info.get_channel_name() == 'GitHub'
|
||||
assert info.get_thumbnail() == actual_thumbnail_url
|
||||
@@ -30,7 +31,7 @@ def test_archived_page(mocker):
|
||||
def test_live_page(mocker):
|
||||
_set_test_data('tests/testdata/videoinfo/live_page.txt', mocker)
|
||||
info = VideoInfo('__test_id__')
|
||||
'''live page :duration = 0'''
|
||||
'''live page: duration==0'''
|
||||
assert info.get_duration() == 0
|
||||
assert info.video_id == '__test_id__'
|
||||
assert info.get_channel_name() == 'BGM channel'
|
||||
@@ -64,3 +65,37 @@ def test_no_info(mocker):
|
||||
assert info.get_title() is None
|
||||
assert info.get_channel_id() is None
|
||||
assert info.get_duration() is None
|
||||
|
||||
|
||||
def test_collapsed_data(mocker):
|
||||
'''Test the case where the video page's info is collapsed.'''
|
||||
_set_test_data(
|
||||
'tests/testdata/videoinfo/collapsed_page.txt', mocker)
|
||||
try:
|
||||
_ = VideoInfo('__test_id__')
|
||||
assert False
|
||||
except JSONDecodeError:
|
||||
assert True
|
||||
|
||||
|
||||
def test_pattern_unmatch(mocker):
|
||||
'''Test the case where the extraction pattern does not match.'''
|
||||
_set_test_data(
|
||||
'tests/testdata/videoinfo/pattern_unmatch.txt', mocker)
|
||||
try:
|
||||
_ = VideoInfo('__test_id__')
|
||||
assert False
|
||||
except JSONDecodeError:
|
||||
assert True
|
||||
|
||||
|
||||
def test_extradata_handling(mocker):
|
||||
'''Test the case where the extracted data are JSON lines.'''
|
||||
_set_test_data(
|
||||
'tests/testdata/videoinfo/extradata_page.txt', mocker)
|
||||
try:
|
||||
_ = VideoInfo('__test_id__')
|
||||
assert True
|
||||
except JSONDecodeError as e:
|
||||
print(e.doc)
|
||||
assert False
|
||||
|
||||
15
tests/testdata/videoinfo/collapsed_page.txt
vendored
Normal file
File diff suppressed because one or more lines are too long
15
tests/testdata/videoinfo/extradata_page.txt
vendored
Normal file
File diff suppressed because one or more lines are too long
15
tests/testdata/videoinfo/pattern_unmatch.txt
vendored
Normal file
File diff suppressed because one or more lines are too long