Compare commits

..

No commits in common. "master" and "2025.06.09" have entirely different histories.

36 changed files with 844 additions and 1453 deletions

View File

@ -1,41 +0,0 @@
name: Signature Tests
on:
push:
paths:
- .github/workflows/signature-tests.yml
- test/test_youtube_signature.py
- yt_dlp/jsinterp.py
pull_request:
paths:
- .github/workflows/signature-tests.yml
- test/test_youtube_signature.py
- yt_dlp/jsinterp.py
permissions:
contents: read
concurrency:
group: signature-tests-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
tests:
name: Signature Tests
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest]
python-version: ['3.9', '3.10', '3.11', '3.12', '3.13', pypy-3.10, pypy-3.11]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install test requirements
run: python3 ./devscripts/install_deps.py --only-optional --include test
- name: Run tests
timeout-minutes: 15
run: |
python3 -m yt_dlp -v || true # Print debug head
python3 ./devscripts/run_tests.py test/test_youtube_signature.py

View File

@ -779,8 +779,3 @@ brian6932
iednod55
maxbin123
nullpos
anlar
eason1478
ceandreasen
chauhantirth
helpimnotdrowning

View File

@ -4,48 +4,6 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.06.30
#### Core changes
- **jsinterp**: [Fix `extract_object`](https://github.com/yt-dlp/yt-dlp/commit/958153a226214c86879e36211ac191bf78289578) ([#13580](https://github.com/yt-dlp/yt-dlp/issues/13580)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **bilibilispacevideo**: [Extract hidden-mode collections as playlists](https://github.com/yt-dlp/yt-dlp/commit/99b85ac102047446e6adf5b62bfc3c8d80b53778) ([#13533](https://github.com/yt-dlp/yt-dlp/issues/13533)) by [c-basalt](https://github.com/c-basalt)
- **hotstar**
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/b5bd057fe86550f3aa67f2fc8790d1c6a251c57b) ([#13530](https://github.com/yt-dlp/yt-dlp/issues/13530)) by [bashonly](https://github.com/bashonly), [chauhantirth](https://github.com/chauhantirth) (With fixes in [e9f1576](https://github.com/yt-dlp/yt-dlp/commit/e9f157669e24953a88d15ce22053649db7a8e81e) by [bashonly](https://github.com/bashonly))
- [Fix metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/0a6b1044899f452cd10b6c7a6b00fa985a9a8b97) ([#13560](https://github.com/yt-dlp/yt-dlp/issues/13560)) by [bashonly](https://github.com/bashonly)
- [Raise for login required](https://github.com/yt-dlp/yt-dlp/commit/5e292baad62c749b6c340621ab2d0f904165ddfb) ([#10405](https://github.com/yt-dlp/yt-dlp/issues/10405)) by [bashonly](https://github.com/bashonly)
- series: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/4bd9a7ade7e0508b9795b3e72a69eeb40788b62b) ([#13564](https://github.com/yt-dlp/yt-dlp/issues/13564)) by [bashonly](https://github.com/bashonly)
- **jiocinema**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/7e2504f941a11ea2b0dba00de3f0295cdc253e79) ([#13565](https://github.com/yt-dlp/yt-dlp/issues/13565)) by [bashonly](https://github.com/bashonly)
- **kick**: [Support subscriber-only content](https://github.com/yt-dlp/yt-dlp/commit/b16722ede83377f77ea8352dcd0a6ca8e83b8f0f) ([#13550](https://github.com/yt-dlp/yt-dlp/issues/13550)) by [helpimnotdrowning](https://github.com/helpimnotdrowning)
- **niconico**: live: [Fix extractor and downloader](https://github.com/yt-dlp/yt-dlp/commit/06c1a8cdffe14050206683253726875144192ef5) ([#13158](https://github.com/yt-dlp/yt-dlp/issues/13158)) by [doe1080](https://github.com/doe1080)
- **sauceplus**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/35fc33fbc51c7f5392fb2300f65abf6cf107ef90) ([#13567](https://github.com/yt-dlp/yt-dlp/issues/13567)) by [bashonly](https://github.com/bashonly), [ceandreasen](https://github.com/ceandreasen)
- **sproutvideo**: [Support browser impersonation](https://github.com/yt-dlp/yt-dlp/commit/11b9416e10cff7513167d76d6c47774fcdd3e26a) ([#13589](https://github.com/yt-dlp/yt-dlp/issues/13589)) by [bashonly](https://github.com/bashonly)
- **youtube**: [Fix premium formats extraction](https://github.com/yt-dlp/yt-dlp/commit/2ba5391cd68ed4f2415c827d2cecbcbc75ace10b) ([#13586](https://github.com/yt-dlp/yt-dlp/issues/13586)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **ci**: [Add signature tests](https://github.com/yt-dlp/yt-dlp/commit/1b883846347addeab12663fd74317fd544341a1c) ([#13582](https://github.com/yt-dlp/yt-dlp/issues/13582)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [b018784](https://github.com/yt-dlp/yt-dlp/commit/b0187844988e557c7e1e6bb1aabd4c1176768d86) by [bashonly](https://github.com/bashonly)
### 2025.06.25
#### Extractor changes
- [Add `_search_nuxt_json` helper](https://github.com/yt-dlp/yt-dlp/commit/51887484e46ab6015c041cb1ab626a55f25a03bd) ([#13386](https://github.com/yt-dlp/yt-dlp/issues/13386)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
- **brightcove**: new: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/e6bd4a3da295b760ab20b39c18ce8934d312c2bf) ([#13461](https://github.com/yt-dlp/yt-dlp/issues/13461)) by [doe1080](https://github.com/doe1080)
- **huya**: live: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2600849badb0d08c55b58dcc77a13af6ba423da6) ([#13520](https://github.com/yt-dlp/yt-dlp/issues/13520)) by [doe1080](https://github.com/doe1080)
- **hypergryph**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/1722c55400ff30bb5aee5dd7a262f0b7e9ce2f0e) ([#13415](https://github.com/yt-dlp/yt-dlp/issues/13415)) by [doe1080](https://github.com/doe1080), [eason1478](https://github.com/eason1478)
- **lsm**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/c57412d1f9cf0124adc972a47858ac42b740c61d) ([#13126](https://github.com/yt-dlp/yt-dlp/issues/13126)) by [Caesim404](https://github.com/Caesim404)
- **mave**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/1838a1ce5d4ade80770ba9162eaffc9a1607dc70) ([#13380](https://github.com/yt-dlp/yt-dlp/issues/13380)) by [anlar](https://github.com/anlar)
- **sportdeutschland**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/a4ce4327c9836691d3b6b00e44a90b6741601ed8) ([#13519](https://github.com/yt-dlp/yt-dlp/issues/13519)) by [DTrombett](https://github.com/DTrombett)
- **sproutvideo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/5b559d0072b7164daf06bacdc41c6f11283452c8) ([#13544](https://github.com/yt-dlp/yt-dlp/issues/13544)) by [bashonly](https://github.com/bashonly)
- **tv8.it**: [Support slugless URLs](https://github.com/yt-dlp/yt-dlp/commit/3bd30291601c47fa4a257983473884103ecab0c7) ([#13478](https://github.com/yt-dlp/yt-dlp/issues/13478)) by [DTrombett](https://github.com/DTrombett)
- **youtube**
- [Check any `ios` m3u8 formats prior to download](https://github.com/yt-dlp/yt-dlp/commit/8f94b76cbf7bbd9dfd8762c63cdea04f90f1297f) ([#13524](https://github.com/yt-dlp/yt-dlp/issues/13524)) by [bashonly](https://github.com/bashonly)
- [Improve player context payloads](https://github.com/yt-dlp/yt-dlp/commit/ff6f94041aeee19c5559e1c1cd693960a1c1dd14) ([#13539](https://github.com/yt-dlp/yt-dlp/issues/13539)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **test**: `traversal`: [Fix morsel tests for Python 3.14](https://github.com/yt-dlp/yt-dlp/commit/73bf10211668e4a59ccafd790e06ee82d9fea9ea) ([#13471](https://github.com/yt-dlp/yt-dlp/issues/13471)) by [Grub4K](https://github.com/Grub4K)
### 2025.06.09
#### Extractor changes

View File

@ -254,13 +254,5 @@
{
"action": "remove",
"when": "d596824c2f8428362c072518856065070616e348"
},
{
"action": "remove",
"when": "7b81634fb1d15999757e7a9883daa6ef09ea785b"
},
{
"action": "remove",
"when": "500761e41acb96953a5064e951d41d190c287e46"
}
]

View File

@ -575,7 +575,9 @@ The only reliable way to check if a site is supported is to try it.
- **HollywoodReporterPlaylist**
- **Holodex**
- **HotNewHipHop**: (**Currently broken**)
- **hotstar**: JioHotstar
- **hotstar**
- **hotstar:playlist**
- **hotstar:season**
- **hotstar:series**
- **hrfernsehen**
- **HRTi**: [*hrti*](## "netrc machine")
@ -588,7 +590,7 @@ The only reliable way to check if a site is supported is to try it.
- **Hungama**
- **HungamaAlbumPlaylist**
- **HungamaSong**
- **huya:live**: 虎牙直播
- **huya:live**: huya.com
- **huya:video**: 虎牙视频
- **Hypem**
- **Hytale**
@ -645,6 +647,8 @@ The only reliable way to check if a site is supported is to try it.
- **Jamendo**
- **JamendoAlbum**
- **JeuxVideo**: (**Currently broken**)
- **jiocinema**: [*jiocinema*](## "netrc machine")
- **jiocinema:series**: [*jiocinema*](## "netrc machine")
- **jiosaavn:album**
- **jiosaavn:artist**
- **jiosaavn:playlist**
@ -772,7 +776,6 @@ The only reliable way to check if a site is supported is to try it.
- **massengeschmack.tv**
- **Masters**
- **MatchTV**
- **Mave**
- **MBN**: mbn.co.kr (매일방송)
- **MDR**: MDR.DE
- **MedalTV**
@ -829,7 +832,7 @@ The only reliable way to check if a site is supported is to try it.
- **Mojevideo**: mojevideo.sk
- **Mojvideo**
- **Monstercat**
- **monstersiren**: 塞壬唱片
- **MonsterSirenHypergryphMusic**
- **Motherless**
- **MotherlessGallery**
- **MotherlessGroup**
@ -1295,7 +1298,6 @@ The only reliable way to check if a site is supported is to try it.
- **SampleFocus**
- **Sangiin**: 参議院インターネット審議中継 (archive)
- **Sapo**: SAPO Vídeos
- **SaucePlus**: Sauce+
- **SBS**: sbs.com.au
- **sbs.co.kr**
- **sbs.co.kr:allvod_program**

View File

@ -1947,137 +1947,6 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
with self.assertWarns(DeprecationWarning):
self.assertEqual(self.ie._search_nextjs_data('', None, default='{}'), {})
def test_search_nuxt_json(self):
HTML_TMPL = '<script data-ssr="true" id="__NUXT_DATA__" type="application/json">[{}]</script>'
VALID_DATA = '''
["ShallowReactive",1],
{"data":2,"state":21,"once":25,"_errors":28,"_server_errors":30},
["ShallowReactive",3],
{"$abcdef123456":4},
{"podcast":5,"activeEpisodeData":7},
{"podcast":6,"seasons":14},
{"title":10,"id":11},
["Reactive",8],
{"episode":9,"creators":18,"empty_list":20},
{"title":12,"id":13,"refs":34,"empty_refs":35},
"Series Title",
"podcast-id-01",
"Episode Title",
"episode-id-99",
[15,16,17],
1,
2,
3,
[19],
"Podcast Creator",
[],
{"$ssite-config":22},
{"env":23,"name":24,"map":26,"numbers":14},
"production",
"podcast-website",
["Set"],
["Reactive",27],
["Map"],
["ShallowReactive",29],
{},
["NuxtError",31],
{"status":32,"message":33},
503,
"Service Unavailable",
[36,37],
[38,39],
["Ref",40],
["ShallowRef",41],
["EmptyRef",42],
["EmptyShallowRef",43],
"ref",
"shallow_ref",
"{\\"ref\\":1}",
"{\\"shallow_ref\\":2}"
'''
PAYLOAD = {
'data': {
'$abcdef123456': {
'podcast': {
'podcast': {
'title': 'Series Title',
'id': 'podcast-id-01',
},
'seasons': [1, 2, 3],
},
'activeEpisodeData': {
'episode': {
'title': 'Episode Title',
'id': 'episode-id-99',
'refs': ['ref', 'shallow_ref'],
'empty_refs': [{'ref': 1}, {'shallow_ref': 2}],
},
'creators': ['Podcast Creator'],
'empty_list': [],
},
},
},
'state': {
'$ssite-config': {
'env': 'production',
'name': 'podcast-website',
'map': [],
'numbers': [1, 2, 3],
},
},
'once': [],
'_errors': {},
'_server_errors': {
'status': 503,
'message': 'Service Unavailable',
},
}
PARTIALLY_INVALID = [(
'''
{"data":1},
{"invalid_raw_list":2},
[15,16,17]
''',
{'data': {'invalid_raw_list': [None, None, None]}},
), (
'''
{"data":1},
["EmptyRef",2],
"not valid JSON"
''',
{'data': None},
), (
'''
{"data":1},
["EmptyShallowRef",2],
"not valid JSON"
''',
{'data': None},
)]
INVALID = [
'''
[]
''',
'''
["unsupported",1],
{"data":2},
{}
''',
]
DEFAULT = object()
self.assertEqual(self.ie._search_nuxt_json(HTML_TMPL.format(VALID_DATA), None), PAYLOAD)
self.assertEqual(self.ie._search_nuxt_json('', None, fatal=False), {})
self.assertIs(self.ie._search_nuxt_json('', None, default=DEFAULT), DEFAULT)
for data, expected in PARTIALLY_INVALID:
self.assertEqual(
self.ie._search_nuxt_json(HTML_TMPL.format(data), None, fatal=False), expected)
for data in INVALID:
self.assertIs(
self.ie._search_nuxt_json(HTML_TMPL.format(data), None, default=DEFAULT), DEFAULT)
if __name__ == '__main__':
unittest.main()

View File

@ -1,235 +0,0 @@
#!/usr/bin/env python3
# Allow direct execution
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import datetime as dt
import json
import math
import re
import unittest
from yt_dlp.utils.jslib import devalue
TEST_CASES_EQUALS = [{
'name': 'int',
'unparsed': [-42],
'parsed': -42,
}, {
'name': 'str',
'unparsed': ['woo!!!'],
'parsed': 'woo!!!',
}, {
'name': 'Number',
'unparsed': [['Object', 42]],
'parsed': 42,
}, {
'name': 'String',
'unparsed': [['Object', 'yar']],
'parsed': 'yar',
}, {
'name': 'Infinity',
'unparsed': -4,
'parsed': math.inf,
}, {
'name': 'negative Infinity',
'unparsed': -5,
'parsed': -math.inf,
}, {
'name': 'negative zero',
'unparsed': -6,
'parsed': -0.0,
}, {
'name': 'RegExp',
'unparsed': [['RegExp', 'regexp', 'gim']], # XXX: flags are ignored
'parsed': re.compile('regexp'),
}, {
'name': 'Date',
'unparsed': [['Date', '2001-09-09T01:46:40.000Z']],
'parsed': dt.datetime.fromtimestamp(1e9, tz=dt.timezone.utc),
}, {
'name': 'Array',
'unparsed': [[1, 2, 3], 'a', 'b', 'c'],
'parsed': ['a', 'b', 'c'],
}, {
'name': 'Array (empty)',
'unparsed': [[]],
'parsed': [],
}, {
'name': 'Array (sparse)',
'unparsed': [[-2, 1, -2], 'b'],
'parsed': [None, 'b', None],
}, {
'name': 'Object',
'unparsed': [{'foo': 1, 'x-y': 2}, 'bar', 'z'],
'parsed': {'foo': 'bar', 'x-y': 'z'},
}, {
'name': 'Set',
'unparsed': [['Set', 1, 2, 3], 1, 2, 3],
'parsed': [1, 2, 3],
}, {
'name': 'Map',
'unparsed': [['Map', 1, 2], 'a', 'b'],
'parsed': [['a', 'b']],
}, {
'name': 'BigInt',
'unparsed': [['BigInt', '1']],
'parsed': 1,
}, {
'name': 'Uint8Array',
'unparsed': [['Uint8Array', 'AQID']],
'parsed': [1, 2, 3],
}, {
'name': 'ArrayBuffer',
'unparsed': [['ArrayBuffer', 'AQID']],
'parsed': [1, 2, 3],
}, {
'name': 'str (repetition)',
'unparsed': [[1, 1], 'a string'],
'parsed': ['a string', 'a string'],
}, {
'name': 'None (repetition)',
'unparsed': [[1, 1], None],
'parsed': [None, None],
}, {
'name': 'dict (repetition)',
'unparsed': [[1, 1], {}],
'parsed': [{}, {}],
}, {
'name': 'Object without prototype',
'unparsed': [['null']],
'parsed': {},
}, {
'name': 'cross-realm POJO',
'unparsed': [{}],
'parsed': {},
}]
TEST_CASES_IS = [{
'name': 'bool',
'unparsed': [True],
'parsed': True,
}, {
'name': 'Boolean',
'unparsed': [['Object', False]],
'parsed': False,
}, {
'name': 'undefined',
'unparsed': -1,
'parsed': None,
}, {
'name': 'null',
'unparsed': [None],
'parsed': None,
}, {
'name': 'NaN',
'unparsed': -3,
'parsed': math.nan,
}]
TEST_CASES_INVALID = [{
'name': 'empty string',
'unparsed': '',
'error': ValueError,
'pattern': r'expected int or list as input',
}, {
'name': 'hole',
'unparsed': -2,
'error': ValueError,
'pattern': r'invalid integer input',
}, {
'name': 'string',
'unparsed': 'hello',
'error': ValueError,
'pattern': r'expected int or list as input',
}, {
'name': 'number',
'unparsed': 42,
'error': ValueError,
'pattern': r'invalid integer input',
}, {
'name': 'boolean',
'unparsed': True,
'error': ValueError,
'pattern': r'expected int or list as input',
}, {
'name': 'null',
'unparsed': None,
'error': ValueError,
'pattern': r'expected int or list as input',
}, {
'name': 'object',
'unparsed': {},
'error': ValueError,
'pattern': r'expected int or list as input',
}, {
'name': 'empty array',
'unparsed': [],
'error': ValueError,
'pattern': r'expected a non-empty list as input',
}, {
'name': 'Python negative indexing',
'unparsed': [[1, 2, 3, 4, 5, 6, 7, -7], 1, 2, 3, 4, 5, 6, 7],
'error': IndexError,
'pattern': r'invalid index: -7',
}]
class TestDevalue(unittest.TestCase):
def test_devalue_parse_equals(self):
for tc in TEST_CASES_EQUALS:
self.assertEqual(devalue.parse(tc['unparsed']), tc['parsed'], tc['name'])
def test_devalue_parse_is(self):
for tc in TEST_CASES_IS:
self.assertIs(devalue.parse(tc['unparsed']), tc['parsed'], tc['name'])
def test_devalue_parse_invalid(self):
for tc in TEST_CASES_INVALID:
with self.assertRaisesRegex(tc['error'], tc['pattern'], msg=tc['name']):
devalue.parse(tc['unparsed'])
def test_devalue_parse_cyclical(self):
name = 'Map (cyclical)'
result = devalue.parse([['Map', 1, 0], 'self'])
self.assertEqual(result[0][0], 'self', name)
self.assertIs(result, result[0][1], name)
name = 'Set (cyclical)'
result = devalue.parse([['Set', 0, 1], 42])
self.assertEqual(result[1], 42, name)
self.assertIs(result, result[0], name)
result = devalue.parse([[0]])
self.assertIs(result, result[0], 'Array (cyclical)')
name = 'Object (cyclical)'
result = devalue.parse([{'self': 0}])
self.assertIs(result, result['self'], name)
name = 'Object with null prototype (cyclical)'
result = devalue.parse([['null', 'self', 0]])
self.assertIs(result, result['self'], name)
name = 'Objects (cyclical)'
result = devalue.parse([[1, 2], {'second': 2}, {'first': 1}])
self.assertIs(result[0], result[1]['first'], name)
self.assertIs(result[1], result[0]['second'], name)
def test_devalue_parse_revivers(self):
self.assertEqual(
devalue.parse([['indirect', 1], {'a': 2}, 'b'], revivers={'indirect': lambda x: x}),
{'a': 'b'}, 'revivers (indirect)')
self.assertEqual(
devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': lambda x: json.loads(x)}),
{'a': 0}, 'revivers (parse)')
if __name__ == '__main__':
unittest.main()

View File

@ -478,10 +478,6 @@ class TestJSInterpreter(unittest.TestCase):
func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
self.assertEqual(func([1]), 1111)
def test_extract_object(self):
jsi = JSInterpreter('var a={};a.xy={};var xy;var zxy={};xy={z:function(){return "abc"}};')
self.assertTrue('z' in jsi.extract_object('xy', None))
def test_increment_decrement(self):
self._test('function f() { var x = 1; return ++x; }', 2)
self._test('function f() { var x = 1; return x++; }', 1)

View File

@ -416,8 +416,18 @@ class TestTraversal:
'`any` should allow further branching'
def test_traversal_morsel(self):
values = {
'expires': 'a',
'path': 'b',
'comment': 'c',
'domain': 'd',
'max-age': 'e',
'secure': 'f',
'httponly': 'g',
'version': 'h',
'samesite': 'i',
}
morsel = http.cookies.Morsel()
values = dict(zip(morsel, 'abcdefghijklmnop'))
morsel.set('item_key', 'item_value', 'coded_value')
morsel.update(values)
values['key'] = 'item_key'

View File

@ -133,11 +133,6 @@ _SIG_TESTS = [
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/e12fbea4/player_ias.vflset/en_US/base.js',
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt',
'JC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit0zJAtIEsgOV2SXZjhSHMNy0NXNG_1kOyBf6HPuAuCduh-a',
),
]
_NSIG_TESTS = [

View File

@ -2219,7 +2219,6 @@ class YoutubeDL:
self.report_warning(f'Unable to delete temporary file "{temp_file.name}"')
f['__working'] = success
if success:
f.pop('__needs_testing', None)
yield f
else:
self.to_screen('[info] Unable to download format {}. Skipping...'.format(f['format_id']))
@ -3964,7 +3963,6 @@ class YoutubeDL:
self._format_out('UNSUPPORTED', self.Styles.BAD_FORMAT) if f.get('ext') in ('f4f', 'f4m') else None,
(self._format_out('Maybe DRM', self.Styles.WARNING) if f.get('has_drm') == 'maybe'
else self._format_out('DRM', self.Styles.BAD_FORMAT) if f.get('has_drm') else None),
self._format_out('Untested', self.Styles.WARNING) if f.get('__needs_testing') else None,
format_field(f, 'format_note'),
format_field(f, 'container', ignore=(None, f.get('ext'))),
delim=', '), delim=' '),

View File

@ -5,46 +5,47 @@ import time
from .common import FileDownloader
from .external import FFmpegFD
from ..networking import Request
from ..networking.websocket import WebSocketResponse
from ..utils import DownloadError, str_or_none, truncate_string
from ..utils.traversal import traverse_obj
from ..utils import DownloadError, str_or_none, try_get
class NiconicoLiveFD(FileDownloader):
""" Downloads niconico live without being stopped """
def real_download(self, filename, info_dict):
video_id = info_dict['id']
opts = info_dict['downloader_options']
quality, ws_extractor, ws_url = opts['max_quality'], opts['ws'], opts['ws_url']
video_id = info_dict['video_id']
ws_url = info_dict['url']
ws_extractor = info_dict['ws']
ws_origin_host = info_dict['origin']
live_quality = info_dict.get('live_quality', 'high')
live_latency = info_dict.get('live_latency', 'high')
dl = FFmpegFD(self.ydl, self.params or {})
new_info_dict = info_dict.copy()
new_info_dict['protocol'] = 'm3u8'
new_info_dict.update({
'protocol': 'm3u8',
})
def communicate_ws(reconnect):
# Support --load-info-json as if it is a reconnect attempt
if reconnect or not isinstance(ws_extractor, WebSocketResponse):
ws = self.ydl.urlopen(Request(
ws_url, headers={'Origin': 'https://live.nicovideo.jp'}))
if reconnect:
ws = self.ydl.urlopen(Request(ws_url, headers={'Origin': f'https://{ws_origin_host}'}))
if self.ydl.params.get('verbose', False):
self.write_debug('Sending startWatching request')
self.to_screen('[debug] Sending startWatching request')
ws.send(json.dumps({
'type': 'startWatching',
'data': {
'reconnect': True,
'room': {
'commentable': True,
'protocol': 'webSocket',
},
'stream': {
'quality': live_quality,
'protocol': 'hls+fmp4',
'latency': live_latency,
'accessRightMethod': 'single_cookie',
'chasePlay': False,
'latency': 'high',
'protocol': 'hls',
'quality': quality,
},
'room': {
'protocol': 'webSocket',
'commentable': True,
},
'reconnect': True,
},
'type': 'startWatching',
}))
else:
ws = ws_extractor
@ -57,6 +58,7 @@ class NiconicoLiveFD(FileDownloader):
if not data or not isinstance(data, dict):
continue
if data.get('type') == 'ping':
# pong back
ws.send(r'{"type":"pong"}')
ws.send(r'{"type":"keepSeat"}')
elif data.get('type') == 'disconnect':
@ -64,10 +66,12 @@ class NiconicoLiveFD(FileDownloader):
return True
elif data.get('type') == 'error':
self.write_debug(data)
message = traverse_obj(data, ('body', 'code', {str_or_none}), default=recv)
message = try_get(data, lambda x: x['body']['code'], str) or recv
return DownloadError(message)
elif self.ydl.params.get('verbose', False):
self.write_debug(f'Server response: {truncate_string(recv, 100)}')
if len(recv) > 100:
recv = recv[:100] + '...'
self.to_screen(f'[debug] Server said: {recv}')
def ws_main():
reconnect = False
@ -77,8 +81,7 @@ class NiconicoLiveFD(FileDownloader):
if ret is True:
return
except BaseException as e:
self.to_screen(
f'[niconico:live] {video_id}: Connection error occured, reconnecting after 10 seconds: {e}')
self.to_screen('[{}] {}: Connection error occured, reconnecting after 10 seconds: {}'.format('niconico:live', video_id, str_or_none(e)))
time.sleep(10)
continue
finally:

View File

@ -805,7 +805,9 @@ from .holodex import HolodexIE
from .hotnewhiphop import HotNewHipHopIE
from .hotstar import (
HotStarIE,
HotStarPlaylistIE,
HotStarPrefixIE,
HotStarSeasonIE,
HotStarSeriesIE,
)
from .hrefli import HrefLiRedirectIE
@ -919,6 +921,10 @@ from .japandiet import (
ShugiinItvVodIE,
)
from .jeuxvideo import JeuxVideoIE
from .jiocinema import (
JioCinemaIE,
JioCinemaSeriesIE,
)
from .jiosaavn import (
JioSaavnAlbumIE,
JioSaavnArtistIE,
@ -1101,7 +1107,6 @@ from .markiza import (
from .massengeschmacktv import MassengeschmackTVIE
from .masters import MastersIE
from .matchtv import MatchTVIE
from .mave import MaveIE
from .mbn import MBNIE
from .mdr import MDRIE
from .medaltv import MedalTVIE
@ -1824,7 +1829,6 @@ from .safari import (
from .saitosan import SaitosanIE
from .samplefocus import SampleFocusIE
from .sapo import SapoIE
from .sauceplus import SaucePlusIE
from .sbs import SBSIE
from .sbscokr import (
SBSCoKrAllvodProgramIE,

View File

@ -1226,26 +1226,6 @@ class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
'id': '313580179',
},
'playlist_mincount': 92,
}, {
# Hidden-mode collection
'url': 'https://space.bilibili.com/3669403/video',
'info_dict': {
'id': '3669403',
},
'playlist': [{
'info_dict': {
'_type': 'playlist',
'id': '3669403_3958082',
'title': '合集·直播回放',
'description': '',
'uploader': '月路Yuel',
'uploader_id': '3669403',
'timestamp': int,
'upload_date': str,
'thumbnail': str,
},
}],
'params': {'playlist_items': '7'},
}]
def _real_extract(self, url):
@ -1302,13 +1282,7 @@ class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
}
def get_entries(page_data):
for entry in traverse_obj(page_data, ('list', 'vlist', ..., {dict})):
if traverse_obj(entry, ('meta', 'attribute')) == 156:
# hidden-mode collection doesn't show its videos in uploads; extract as playlist instead
yield self.url_result(
f'https://space.bilibili.com/{entry["mid"]}/lists/{entry["meta"]["id"]}?type=season',
BilibiliCollectionListIE, f'{entry["mid"]}_{entry["meta"]["id"]}')
else:
for entry in traverse_obj(page_data, ('list', 'vlist')) or []:
yield self.url_result(f'https://www.bilibili.com/video/{entry["bvid"]}', BiliBiliIE, entry['bvid'])
metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries)

View File

@ -495,6 +495,8 @@ class BrightcoveLegacyIE(InfoExtractor):
class BrightcoveNewBaseIE(AdobePassIE):
def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
title = json_data['name'].strip()
formats, subtitles = [], {}
sources = json_data.get('sources') or []
for source in sources:
@ -598,18 +600,16 @@ class BrightcoveNewBaseIE(AdobePassIE):
return {
'id': video_id,
'title': title,
'description': clean_html(json_data.get('description')),
'thumbnails': thumbnails,
'duration': duration,
'timestamp': parse_iso8601(json_data.get('published_at')),
'uploader_id': json_data.get('account_id'),
'formats': formats,
'subtitles': subtitles,
'tags': json_data.get('tags', []),
'is_live': is_live,
**traverse_obj(json_data, {
'title': ('name', {clean_html}),
'description': ('description', {clean_html}),
'tags': ('tags', ..., {str}, filter, all, filter),
'timestamp': ('published_at', {parse_iso8601}),
'uploader_id': ('account_id', {str}),
}),
}
@ -645,7 +645,10 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
'uploader_id': '4036320279001',
'formats': 'mincount:39',
},
'skip': '404 Not Found',
'params': {
# m3u8 download
'skip_download': True,
},
}, {
# playlist stream
'url': 'https://players.brightcove.net/1752604059001/S13cJdUBz_default/index.html?playlistId=5718313430001',
@ -706,6 +709,7 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
'ext': 'mp4',
'title': 'TGD_01-032_5',
'thumbnail': r're:^https?://.*\.jpg$',
'tags': [],
'timestamp': 1646078943,
'uploader_id': '1569565978001',
'upload_date': '20220228',
@ -717,6 +721,7 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
'ext': 'mp4',
'title': 'TGD 01-087 (Airs 05.25.22)_Segment 5',
'thumbnail': r're:^https?://.*\.jpg$',
'tags': [],
'timestamp': 1651604591,
'uploader_id': '1569565978001',
'upload_date': '20220503',

View File

@ -11,7 +11,7 @@ from ..utils.traversal import traverse_obj
class CloudyCDNIE(InfoExtractor):
_VALID_URL = r'(?:https?:)?//embed\.(?P<domain>cloudycdn\.services|backscreen\.com)/(?P<site_id>[^/?#]+)/media/(?P<id>[\w-]+)'
_VALID_URL = r'(?:https?:)?//embed\.cloudycdn\.services/(?P<site_id>[^/?#]+)/media/(?P<id>[\w-]+)'
_EMBED_REGEX = [rf'<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL})']
_TESTS = [{
'url': 'https://embed.cloudycdn.services/ltv/media/46k_d23-6000-105?',
@ -23,7 +23,7 @@ class CloudyCDNIE(InfoExtractor):
'duration': 1442,
'upload_date': '20231121',
'title': 'D23-6000-105_cetstud',
'thumbnail': 'https://store.bstrm.net/tmsp00060/assets/media/660858/placeholder1700589200.jpg',
'thumbnail': 'https://store.cloudycdn.services/tmsp00060/assets/media/660858/placeholder1700589200.jpg',
},
}, {
'url': 'https://embed.cloudycdn.services/izm/media/26e_lv-8-5-1',
@ -33,7 +33,7 @@ class CloudyCDNIE(InfoExtractor):
'ext': 'mp4',
'title': 'LV-8-5-1',
'timestamp': 1669767167,
'thumbnail': 'https://store.bstrm.net/tmsp00120/assets/media/488306/placeholder1679423604.jpg',
'thumbnail': 'https://store.cloudycdn.services/tmsp00120/assets/media/488306/placeholder1679423604.jpg',
'duration': 1205,
'upload_date': '20221130',
},
@ -48,21 +48,9 @@ class CloudyCDNIE(InfoExtractor):
'duration': 1673,
'title': 'D24-6000-074-cetstud',
'timestamp': 1718902233,
'thumbnail': 'https://store.bstrm.net/tmsp00060/assets/media/788392/placeholder1718903938.jpg',
'thumbnail': 'https://store.cloudycdn.services/tmsp00060/assets/media/788392/placeholder1718903938.jpg',
},
'params': {'format': 'bv'},
}, {
'url': 'https://embed.backscreen.com/ltv/media/32j_z25-0600-127?',
'md5': '9b6fa09ac1a4de53d4f42b94affc3b42',
'info_dict': {
'id': '32j_z25-0600-127',
'ext': 'mp4',
'title': 'Z25-0600-127-DZ',
'duration': 1906,
'thumbnail': 'https://store.bstrm.net/tmsp00060/assets/media/977427/placeholder1746633646.jpg',
'timestamp': 1746632402,
'upload_date': '20250507',
},
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.tavaklase.lv/video/es-esmu-mina-um-2/',
@ -72,17 +60,17 @@ class CloudyCDNIE(InfoExtractor):
'ext': 'mp4',
'upload_date': '20230223',
'duration': 629,
'thumbnail': 'https://store.bstrm.net/tmsp00120/assets/media/518407/placeholder1678748124.jpg',
'thumbnail': 'https://store.cloudycdn.services/tmsp00120/assets/media/518407/placeholder1678748124.jpg',
'timestamp': 1677181513,
'title': 'LIB-2',
},
}]
def _real_extract(self, url):
domain, site_id, video_id = self._match_valid_url(url).group('domain', 'site_id', 'id')
site_id, video_id = self._match_valid_url(url).group('site_id', 'id')
data = self._download_json(
f'https://player.{domain}/player/{site_id}/media/{video_id}/',
f'https://player.cloudycdn.services/player/{site_id}/media/{video_id}/',
video_id, data=urlencode_postdata({
'version': '6.4.0',
'referer': url,

View File

@ -101,7 +101,6 @@ from ..utils import (
xpath_with_ns,
)
from ..utils._utils import _request_dump_filename
from ..utils.jslib import devalue
class InfoExtractor:
@ -263,9 +262,6 @@ class InfoExtractor:
* http_chunk_size Chunk size for HTTP downloads
* ffmpeg_args Extra arguments for ffmpeg downloader (input)
* ffmpeg_args_out Extra arguments for ffmpeg downloader (output)
* ws (NiconicoLiveFD only) WebSocketResponse
* ws_url (NiconicoLiveFD only) Websockets URL
* max_quality (NiconicoLiveFD only) Max stream quality string
* is_dash_periods Whether the format is a result of merging
multiple DASH periods.
RTMP formats can also have the additional fields: page_url,
@ -1799,63 +1795,6 @@ class InfoExtractor:
ret = self._parse_json(js, video_id, transform_source=functools.partial(js_to_json, vars=args), fatal=fatal)
return traverse_obj(ret, traverse) or {}
def _resolve_nuxt_array(self, array, video_id, *, fatal=True, default=NO_DEFAULT):
"""Resolves Nuxt rich JSON payload arrays"""
# Ref: https://github.com/nuxt/nuxt/commit/9e503be0f2a24f4df72a3ccab2db4d3e63511f57
# https://github.com/nuxt/nuxt/pull/19205
if default is not NO_DEFAULT:
fatal = False
if not isinstance(array, list) or not array:
error_msg = 'Unable to resolve Nuxt JSON data: invalid input'
if fatal:
raise ExtractorError(error_msg, video_id=video_id)
elif default is NO_DEFAULT:
self.report_warning(error_msg, video_id=video_id)
return {} if default is NO_DEFAULT else default
def indirect_reviver(data):
return data
def json_reviver(data):
return json.loads(data)
gen = devalue.parse_iter(array, revivers={
'NuxtError': indirect_reviver,
'EmptyShallowRef': json_reviver,
'EmptyRef': json_reviver,
'ShallowRef': indirect_reviver,
'ShallowReactive': indirect_reviver,
'Ref': indirect_reviver,
'Reactive': indirect_reviver,
})
while True:
try:
error_msg = f'Error resolving Nuxt JSON: {gen.send(None)}'
if fatal:
raise ExtractorError(error_msg, video_id=video_id)
elif default is NO_DEFAULT:
self.report_warning(error_msg, video_id=video_id, only_once=True)
else:
self.write_debug(f'{video_id}: {error_msg}', only_once=True)
except StopIteration as error:
return error.value or ({} if default is NO_DEFAULT else default)
def _search_nuxt_json(self, webpage, video_id, *, fatal=True, default=NO_DEFAULT):
"""Parses metadata from Nuxt rich JSON payloads embedded in HTML"""
passed_default = default is not NO_DEFAULT
array = self._search_json(
r'<script\b[^>]+\bid="__NUXT_DATA__"[^>]*>', webpage,
'Nuxt JSON data', video_id, contains_pattern=r'\[(?s:.+)\]',
fatal=fatal, default=NO_DEFAULT if not passed_default else None)
if not array:
return default if passed_default else {}
return self._resolve_nuxt_array(array, video_id, fatal=fatal, default=default)
@staticmethod
def _hidden_inputs(html):
html = re.sub(r'<!--(?:(?!<!--).)*-->', '', html)

View File

@ -17,140 +17,8 @@ from ..utils import (
from ..utils.traversal import traverse_obj
class FloatplaneBaseIE(InfoExtractor):
def _real_extract(self, url):
post_id = self._match_id(url)
post_data = self._download_json(
f'{self._BASE_URL}/api/v3/content/post', post_id, query={'id': post_id},
note='Downloading post data', errnote='Unable to download post data',
impersonate=self._IMPERSONATE_TARGET)
if not any(traverse_obj(post_data, ('metadata', ('hasVideo', 'hasAudio')))):
raise ExtractorError('Post does not contain a video or audio track', expected=True)
uploader_url = format_field(
post_data, [('creator', 'urlname')], f'{self._BASE_URL}/channel/%s/home') or None
common_info = {
'uploader_url': uploader_url,
'channel_url': urljoin(f'{uploader_url}/', traverse_obj(post_data, ('channel', 'urlname'))),
'availability': self._availability(needs_subscription=True),
**traverse_obj(post_data, {
'uploader': ('creator', 'title', {str}),
'uploader_id': ('creator', 'id', {str}),
'channel': ('channel', 'title', {str}),
'channel_id': ('channel', 'id', {str}),
'release_timestamp': ('releaseDate', {parse_iso8601}),
}),
}
items = []
for media in traverse_obj(post_data, (('videoAttachments', 'audioAttachments'), ...)):
media_id = media['id']
media_typ = media.get('type') or 'video'
metadata = self._download_json(
f'{self._BASE_URL}/api/v3/content/{media_typ}', media_id, query={'id': media_id},
note=f'Downloading {media_typ} metadata', impersonate=self._IMPERSONATE_TARGET)
stream = self._download_json(
f'{self._BASE_URL}/api/v2/cdn/delivery', media_id, query={
'type': 'vod' if media_typ == 'video' else 'aod',
'guid': metadata['guid'],
}, note=f'Downloading {media_typ} stream data',
impersonate=self._IMPERSONATE_TARGET)
path_template = traverse_obj(stream, ('resource', 'uri', {str}))
def format_path(params):
path = path_template
for i, val in (params or {}).items():
path = path.replace(f'{{qualityLevelParams.{i}}}', val)
return path
formats = []
for quality in traverse_obj(stream, ('resource', 'data', 'qualityLevels', ...)):
url = urljoin(stream['cdn'], format_path(traverse_obj(
stream, ('resource', 'data', 'qualityLevelParams', quality['name'], {dict}))))
format_id = traverse_obj(quality, ('name', {str}))
hls_aes = {}
m3u8_data = None
# If we need impersonation for the API, then we need it for HLS keys too: extract in advance
if self._IMPERSONATE_TARGET is not None:
m3u8_data = self._download_webpage(
url, media_id, fatal=False, impersonate=self._IMPERSONATE_TARGET, headers=self._HEADERS,
note=join_nonempty('Downloading', format_id, 'm3u8 information', delim=' '),
errnote=join_nonempty('Failed to download', format_id, 'm3u8 information', delim=' '))
if not m3u8_data:
continue
key_url = self._search_regex(
r'#EXT-X-KEY:METHOD=AES-128,URI="(https?://[^"]+)"',
m3u8_data, 'HLS AES key URI', default=None)
if key_url:
urlh = self._request_webpage(
key_url, media_id, fatal=False, impersonate=self._IMPERSONATE_TARGET, headers=self._HEADERS,
note=join_nonempty('Downloading', format_id, 'HLS AES key', delim=' '),
errnote=join_nonempty('Failed to download', format_id, 'HLS AES key', delim=' '))
if urlh:
hls_aes['key'] = urlh.read().hex()
formats.append({
**traverse_obj(quality, {
'format_note': ('label', {str}),
'width': ('width', {int}),
'height': ('height', {int}),
}),
**parse_codecs(quality.get('codecs')),
'url': url,
'ext': determine_ext(url.partition('/chunk.m3u8')[0], 'mp4'),
'format_id': format_id,
'hls_media_playlist_data': m3u8_data,
'hls_aes': hls_aes or None,
})
items.append({
**common_info,
'id': media_id,
**traverse_obj(metadata, {
'title': ('title', {str}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('thumbnail', 'path', {url_or_none}),
}),
'formats': formats,
})
post_info = {
**common_info,
'id': post_id,
'display_id': post_id,
**traverse_obj(post_data, {
'title': ('title', {str}),
'description': ('text', {clean_html}),
'like_count': ('likes', {int_or_none}),
'dislike_count': ('dislikes', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'thumbnail': ('thumbnail', 'path', {url_or_none}),
}),
'http_headers': self._HEADERS,
}
if len(items) > 1:
return self.playlist_result(items, **post_info)
post_info.update(items[0])
return post_info
class FloatplaneIE(FloatplaneBaseIE):
class FloatplaneIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|beta)\.)?floatplane\.com/post/(?P<id>\w+)'
_BASE_URL = 'https://www.floatplane.com'
_IMPERSONATE_TARGET = None
_HEADERS = {
'Origin': _BASE_URL,
'Referer': f'{_BASE_URL}/',
}
_TESTS = [{
'url': 'https://www.floatplane.com/post/2Yf3UedF7C',
'info_dict': {
@ -302,9 +170,105 @@ class FloatplaneIE(FloatplaneBaseIE):
}]
def _real_initialize(self):
if not self._get_cookies(self._BASE_URL).get('sails.sid'):
if not self._get_cookies('https://www.floatplane.com').get('sails.sid'):
self.raise_login_required()
def _real_extract(self, url):
post_id = self._match_id(url)
post_data = self._download_json(
'https://www.floatplane.com/api/v3/content/post', post_id, query={'id': post_id},
note='Downloading post data', errnote='Unable to download post data')
if not any(traverse_obj(post_data, ('metadata', ('hasVideo', 'hasAudio')))):
raise ExtractorError('Post does not contain a video or audio track', expected=True)
uploader_url = format_field(
post_data, [('creator', 'urlname')], 'https://www.floatplane.com/channel/%s/home') or None
common_info = {
'uploader_url': uploader_url,
'channel_url': urljoin(f'{uploader_url}/', traverse_obj(post_data, ('channel', 'urlname'))),
'availability': self._availability(needs_subscription=True),
**traverse_obj(post_data, {
'uploader': ('creator', 'title', {str}),
'uploader_id': ('creator', 'id', {str}),
'channel': ('channel', 'title', {str}),
'channel_id': ('channel', 'id', {str}),
'release_timestamp': ('releaseDate', {parse_iso8601}),
}),
}
items = []
for media in traverse_obj(post_data, (('videoAttachments', 'audioAttachments'), ...)):
media_id = media['id']
media_typ = media.get('type') or 'video'
metadata = self._download_json(
f'https://www.floatplane.com/api/v3/content/{media_typ}', media_id, query={'id': media_id},
note=f'Downloading {media_typ} metadata')
stream = self._download_json(
'https://www.floatplane.com/api/v2/cdn/delivery', media_id, query={
'type': 'vod' if media_typ == 'video' else 'aod',
'guid': metadata['guid'],
}, note=f'Downloading {media_typ} stream data')
path_template = traverse_obj(stream, ('resource', 'uri', {str}))
def format_path(params):
path = path_template
for i, val in (params or {}).items():
path = path.replace(f'{{qualityLevelParams.{i}}}', val)
return path
formats = []
for quality in traverse_obj(stream, ('resource', 'data', 'qualityLevels', ...)):
url = urljoin(stream['cdn'], format_path(traverse_obj(
stream, ('resource', 'data', 'qualityLevelParams', quality['name'], {dict}))))
formats.append({
**traverse_obj(quality, {
'format_id': ('name', {str}),
'format_note': ('label', {str}),
'width': ('width', {int}),
'height': ('height', {int}),
}),
**parse_codecs(quality.get('codecs')),
'url': url,
'ext': determine_ext(url.partition('/chunk.m3u8')[0], 'mp4'),
})
items.append({
**common_info,
'id': media_id,
**traverse_obj(metadata, {
'title': ('title', {str}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('thumbnail', 'path', {url_or_none}),
}),
'formats': formats,
})
post_info = {
**common_info,
'id': post_id,
'display_id': post_id,
**traverse_obj(post_data, {
'title': ('title', {str}),
'description': ('text', {clean_html}),
'like_count': ('likes', {int_or_none}),
'dislike_count': ('dislikes', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'thumbnail': ('thumbnail', 'path', {url_or_none}),
}),
}
if len(items) > 1:
return self.playlist_result(items, **post_info)
post_info.update(items[0])
return post_info
class FloatplaneChannelIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|beta)\.)?floatplane\.com/channel/(?P<id>[\w-]+)/home(?:/(?P<channel>[\w-]+))?'

View File

@ -1,4 +1,3 @@
import functools
import hashlib
import hmac
import json
@ -10,20 +9,18 @@ from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
OnDemandPagedList,
determine_ext,
int_or_none,
join_nonempty,
str_or_none,
traverse_obj,
url_or_none,
)
from ..utils.traversal import require, traverse_obj
class HotStarBaseIE(InfoExtractor):
_BASE_URL = 'https://www.hotstar.com'
_API_URL = 'https://api.hotstar.com'
_API_URL_V2 = 'https://apix.hotstar.com/v2'
_AKAMAI_ENCRYPTION_KEY = b'\x05\xfc\x1a\x01\xca\xc9\x4b\xc4\x12\xfc\x53\x12\x07\x75\xf9\xee'
def _call_api_v1(self, path, *args, **kwargs):
@ -32,86 +29,57 @@ class HotStarBaseIE(InfoExtractor):
headers={'x-country-code': 'IN', 'x-platform-code': 'PCTV'})
def _call_api_impl(self, path, video_id, query, st=None, cookies=None):
if not cookies or not cookies.get('userUP'):
self.raise_login_required()
st = int_or_none(st) or int(time.time())
exp = st + 6000
auth = f'st={st}~exp={exp}~acl=/*'
auth += '~hmac=' + hmac.new(self._AKAMAI_ENCRYPTION_KEY, auth.encode(), hashlib.sha256).hexdigest()
response = self._download_json(
f'{self._API_URL_V2}/{path}', video_id, query=query,
if cookies and cookies.get('userUP'):
token = cookies.get('userUP').value
else:
token = self._download_json(
f'{self._API_URL}/um/v3/users',
video_id, note='Downloading token',
data=json.dumps({'device_ids': [{'id': str(uuid.uuid4()), 'type': 'device_id'}]}).encode(),
headers={
'user-agent': 'Disney+;in.startv.hotstar.dplus.tv/23.08.14.4.2915 (Android/13)',
'hotstarauth': auth,
'x-hs-usertoken': cookies['userUP'].value,
'x-hs-device-id': traverse_obj(cookies, ('deviceId', 'value')) or str(uuid.uuid4()),
'x-hs-client': 'platform:androidtv;app_id:in.startv.hotstar.dplus.tv;app_version:23.08.14.4;os:Android;os_version:13;schema_version:0.0.970',
'x-hs-platform': 'androidtv',
'content-type': 'application/json',
'x-hs-platform': 'PCTV', # or 'web'
'Content-Type': 'application/json',
})['user_identity']
response = self._download_json(
f'{self._API_URL}/{path}', video_id, query=query,
headers={
'hotstarauth': auth,
'x-hs-appversion': '6.72.2',
'x-hs-platform': 'web',
'x-hs-usertoken': token,
})
if not traverse_obj(response, ('success', {dict})):
raise ExtractorError('API call was unsuccessful')
return response['success']
if response['message'] != "Playback URL's fetched successfully":
raise ExtractorError(
response['message'], expected=True)
return response['data']
def _call_api_v2(self, path, video_id, content_type, cookies=None, st=None):
return self._call_api_impl(f'{path}', video_id, query={
'content_id': video_id,
'filters': f'content_type={content_type}',
'client_capabilities': json.dumps({
'package': ['dash', 'hls'],
'container': ['fmp4br', 'fmp4'],
'ads': ['non_ssai', 'ssai'],
'audio_channel': ['atmos', 'dolby51', 'stereo'],
'encryption': ['plain', 'widevine'], # wv only so we can raise appropriate error
'video_codec': ['h265', 'h264'],
'ladder': ['tv', 'full'],
'resolution': ['4k', 'hd'],
'true_resolution': ['4k', 'hd'],
'dynamic_range': ['hdr', 'sdr'],
}, separators=(',', ':')),
'drm_parameters': json.dumps({
'widevine_security_level': ['SW_SECURE_DECODE', 'SW_SECURE_CRYPTO'],
'hdcp_version': ['HDCP_V2_2', 'HDCP_V2_1', 'HDCP_V2', 'HDCP_V1'],
}, separators=(',', ':')),
}, st=st, cookies=cookies)
@staticmethod
def _parse_metadata_v1(video_data):
return traverse_obj(video_data, {
'id': ('contentId', {str}),
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('duration', {int_or_none}),
'timestamp': (('broadcastDate', 'startDate'), {int_or_none}, any),
'release_year': ('year', {int_or_none}),
'channel': ('channelName', {str}),
'channel_id': ('channelId', {int}, {str_or_none}),
'series': ('showName', {str}),
'season': ('seasonName', {str}),
'season_number': ('seasonNo', {int_or_none}),
'season_id': ('seasonId', {int}, {str_or_none}),
'episode': ('title', {str}),
'episode_number': ('episodeNo', {int_or_none}),
def _call_api_v2(self, path, video_id, st=None, cookies=None):
return self._call_api_impl(
f'{path}/content/{video_id}', video_id, st=st, cookies=cookies, query={
'desired-config': 'audio_channel:stereo|container:fmp4|dynamic_range:hdr|encryption:plain|ladder:tv|package:dash|resolution:fhd|subs-tag:HotstarVIP|video_codec:h265',
'device-id': cookies.get('device_id').value if cookies.get('device_id') else str(uuid.uuid4()),
'os-name': 'Windows',
'os-version': '10',
})
def _fetch_page(self, path, item_id, name, query, root, page):
results = self._call_api_v1(
path, item_id, note=f'Downloading {name} page {page + 1} JSON', query={
**query,
'tao': page * self._PAGE_SIZE,
'tas': self._PAGE_SIZE,
})['body']['results']
for video in traverse_obj(results, (('assets', None), 'items', lambda _, v: v['contentId'])):
def _playlist_entries(self, path, item_id, root=None, **kwargs):
results = self._call_api_v1(path, item_id, **kwargs)['body']['results']
for video in traverse_obj(results, (('assets', None), 'items', ...)):
if video.get('contentId'):
yield self.url_result(
HotStarIE._video_url(video['contentId'], root=root), HotStarIE, **self._parse_metadata_v1(video))
HotStarIE._video_url(video['contentId'], root=root), HotStarIE, video['contentId'])
class HotStarIE(HotStarBaseIE):
IE_NAME = 'hotstar'
IE_DESC = 'JioHotstar'
_VALID_URL = r'''(?x)
https?://(?:www\.)?hotstar\.com(?:/in)?/(?!in/)
(?:
@ -146,16 +114,15 @@ class HotStarIE(HotStarBaseIE):
'upload_date': '20190501',
'duration': 1219,
'channel': 'StarPlus',
'channel_id': '821',
'channel_id': '3',
'series': 'Ek Bhram - Sarvagun Sampanna',
'season': 'Chapter 1',
'season_number': 1,
'season_id': '1260004607',
'season_id': '6771',
'episode': 'Janhvi Targets Suman',
'episode_number': 8,
},
'params': {'skip_download': 'm3u8'},
}, { # Metadata call gets HTTP Error 504 with tas=10000
}, {
'url': 'https://www.hotstar.com/in/shows/anupama/1260022017/anupama-anuj-share-a-moment/1000282843',
'info_dict': {
'id': '1000282843',
@ -167,14 +134,14 @@ class HotStarIE(HotStarBaseIE):
'channel': 'StarPlus',
'series': 'Anupama',
'season_number': 1,
'season_id': '1260022018',
'season_id': '7399',
'upload_date': '20230307',
'episode': 'Anupama, Anuj Share a Moment',
'episode_number': 853,
'duration': 1266,
'channel_id': '821',
'duration': 1272,
'channel_id': '3',
},
'params': {'skip_download': 'm3u8'},
'skip': 'HTTP Error 504: Gateway Time-out', # XXX: Investigate 504 errors on some episodes
}, {
'url': 'https://www.hotstar.com/in/shows/kana-kaanum-kaalangal/1260097087/back-to-school/1260097320',
'info_dict': {
@ -187,15 +154,14 @@ class HotStarIE(HotStarBaseIE):
'channel': 'Hotstar Specials',
'series': 'Kana Kaanum Kaalangal',
'season_number': 1,
'season_id': '1260097089',
'season_id': '9441',
'upload_date': '20220421',
'episode': 'Back To School',
'episode_number': 1,
'duration': 1810,
'channel_id': '1260003991',
'channel_id': '54',
},
'params': {'skip_download': 'm3u8'},
}, { # Metadata call gets HTTP Error 504 with tas=10000
}, {
'url': 'https://www.hotstar.com/in/clips/e3-sairat-kahani-pyaar-ki/1000262286',
'info_dict': {
'id': '1000262286',
@ -207,7 +173,6 @@ class HotStarIE(HotStarBaseIE):
'timestamp': 1622943900,
'duration': 5395,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.hotstar.com/in/movies/premam/1000091195',
'info_dict': {
@ -215,13 +180,12 @@ class HotStarIE(HotStarBaseIE):
'ext': 'mp4',
'title': 'Premam',
'release_year': 2015,
'description': 'md5:096cd8aaae8dab56524823dc19dfa9f7',
'description': 'md5:d833c654e4187b5e34757eafb5b72d7f',
'timestamp': 1462149000,
'upload_date': '20160502',
'episode': 'Premam',
'duration': 8994,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.hotstar.com/movies/radha-gopalam/1000057157',
'only_matching': True,
@ -244,13 +208,6 @@ class HotStarIE(HotStarBaseIE):
None: 'content',
}
_CONTENT_TYPE = {
'movie': 'MOVIE',
'episode': 'EPISODE',
'match': 'SPORT',
'content': 'CLIPS',
}
_IGNORE_MAP = {
'res': 'resolution',
'vcodec': 'video_codec',
@ -272,48 +229,38 @@ class HotStarIE(HotStarBaseIE):
def _real_extract(self, url):
video_id, video_type = self._match_valid_url(url).group('id', 'type')
video_type = self._TYPE[video_type]
video_type = self._TYPE.get(video_type, video_type)
cookies = self._get_cookies(url) # Cookies before any request
video_data = traverse_obj(
self._call_api_v1(f'{video_type}/detail', video_id, fatal=False, query={
'tas': 5, # See https://github.com/yt-dlp/yt-dlp/issues/7946
'contentId': video_id,
}), ('body', 'results', 'item', {dict})) or {}
if video_data.get('drmProtected'):
self._call_api_v1(
f'{video_type}/detail', video_id, fatal=False, query={'tas': 10000, 'contentId': video_id}),
('body', 'results', 'item', {dict})) or {}
if not self.get_param('allow_unplayable_formats') and video_data.get('drmProtected'):
self.report_drm(video_id)
geo_restricted = False
formats, subs, has_drm = [], {}, False
headers = {'Referer': f'{self._BASE_URL}/in'}
content_type = traverse_obj(video_data, ('contentType', {str})) or self._CONTENT_TYPE[video_type]
# See https://github.com/yt-dlp/yt-dlp/issues/396
st = self._request_webpage(
f'{self._BASE_URL}/in', video_id, 'Fetching server time').get_header('x-origin-date')
watch = self._call_api_v2('pages/watch', video_id, content_type, cookies=cookies, st=st)
player_config = traverse_obj(watch, (
'page', 'spaces', 'player', 'widget_wrappers', lambda _, v: v['template'] == 'PlayerWidget',
'widget', 'data', 'player_config', {dict}, any, {require('player config')}))
st = self._download_webpage_handle(f'{self._BASE_URL}/in', video_id)[1].headers.get('x-origin-date')
for playback_set in traverse_obj(player_config, (
('media_asset', 'media_asset_v2'),
('primary', 'fallback'),
all, lambda _, v: url_or_none(v['content_url']),
)):
tags = str_or_none(playback_set.get('playback_tags')) or ''
geo_restricted = False
formats, subs = [], {}
headers = {'Referer': f'{self._BASE_URL}/in'}
# change to v2 in the future
playback_sets = self._call_api_v2('play/v1/playback', video_id, st=st, cookies=cookies)['playBackSets']
for playback_set in playback_sets:
if not isinstance(playback_set, dict):
continue
tags = str_or_none(playback_set.get('tagsCombination')) or ''
if any(f'{prefix}:{ignore}' in tags
for key, prefix in self._IGNORE_MAP.items()
for ignore in self._configuration_arg(key)):
continue
tag_dict = dict((*t.split(':', 1), None)[:2] for t in tags.split(';'))
if tag_dict.get('encryption') not in ('plain', None):
has_drm = True
format_url = url_or_none(playback_set.get('playbackUrl'))
if not format_url:
continue
format_url = re.sub(r'(?<=//staragvod)(\d)', r'web\1', playback_set['content_url'])
format_url = re.sub(r'(?<=//staragvod)(\d)', r'web\1', format_url)
ext = determine_ext(format_url)
current_formats, current_subs = [], {}
@ -333,12 +280,14 @@ class HotStarIE(HotStarBaseIE):
'height': int_or_none(playback_set.get('height')),
}]
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status in (403, 474):
if isinstance(e.cause, HTTPError) and e.cause.status == 403:
geo_restricted = True
else:
self.write_debug(e)
continue
tag_dict = dict((*t.split(':', 1), None)[:2] for t in tags.split(';'))
if tag_dict.get('encryption') not in ('plain', None):
for f in current_formats:
f['has_drm'] = True
for f in current_formats:
for k, v in self._TAG_FIELDS.items():
if not f.get(k):
@ -350,11 +299,6 @@ class HotStarIE(HotStarBaseIE):
'stereo': 2,
'dolby51': 6,
}.get(tag_dict.get('audio_channel'))
if (
'Audio_Description' in f['format_id']
or 'Audio Description' in (f.get('format_note') or '')
):
f['source_preference'] = -99 + (f.get('source_preference') or -1)
f['format_note'] = join_nonempty(
tag_dict.get('ladder'),
tag_dict.get('audio_channel') if f.get('acodec') != 'none' else None,
@ -366,17 +310,27 @@ class HotStarIE(HotStarBaseIE):
if not formats and geo_restricted:
self.raise_geo_restricted(countries=['IN'], metadata_available=True)
elif not formats and has_drm:
self.report_drm(video_id)
self._remove_duplicate_formats(formats)
for f in formats:
f.setdefault('http_headers', {}).update(headers)
return {
**self._parse_metadata_v1(video_data),
'id': video_id,
'title': video_data.get('title'),
'description': video_data.get('description'),
'duration': int_or_none(video_data.get('duration')),
'timestamp': int_or_none(traverse_obj(video_data, 'broadcastDate', 'startDate')),
'release_year': int_or_none(video_data.get('year')),
'formats': formats,
'subtitles': subs,
'channel': video_data.get('channelName'),
'channel_id': str_or_none(video_data.get('channelId')),
'series': video_data.get('showName'),
'season': video_data.get('seasonName'),
'season_number': int_or_none(video_data.get('seasonNo')),
'season_id': str_or_none(video_data.get('seasonId')),
'episode': video_data.get('title'),
'episode_number': int_or_none(video_data.get('episodeNo')),
}
@ -417,6 +371,64 @@ class HotStarPrefixIE(InfoExtractor):
return self.url_result(HotStarIE._video_url(video_id, video_type), HotStarIE, video_id)
class HotStarPlaylistIE(HotStarBaseIE):
IE_NAME = 'hotstar:playlist'
_VALID_URL = r'https?://(?:www\.)?hotstar\.com(?:/in)?/(?:tv|shows)(?:/[^/]+){2}/list/[^/]+/t-(?P<id>\w+)'
_TESTS = [{
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/popular-clips/t-3_2_26',
'info_dict': {
'id': '3_2_26',
},
'playlist_mincount': 20,
}, {
'url': 'https://www.hotstar.com/shows/savdhaan-india/s-26/list/popular-clips/t-3_2_26',
'only_matching': True,
}, {
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/extras/t-2480',
'only_matching': True,
}, {
'url': 'https://www.hotstar.com/in/tv/karthika-deepam/15457/list/popular-clips/t-3_2_1272',
'only_matching': True,
}]
def _real_extract(self, url):
id_ = self._match_id(url)
return self.playlist_result(
self._playlist_entries('tray/find', id_, query={'tas': 10000, 'uqId': id_}), id_)
class HotStarSeasonIE(HotStarBaseIE):
IE_NAME = 'hotstar:season'
_VALID_URL = r'(?P<url>https?://(?:www\.)?hotstar\.com(?:/in)?/(?:tv|shows)/[^/]+/\w+)/seasons/[^/]+/ss-(?P<id>\w+)'
_TESTS = [{
'url': 'https://www.hotstar.com/tv/radhakrishn/1260000646/seasons/season-2/ss-8028',
'info_dict': {
'id': '8028',
},
'playlist_mincount': 35,
}, {
'url': 'https://www.hotstar.com/in/tv/ishqbaaz/9567/seasons/season-2/ss-4357',
'info_dict': {
'id': '4357',
},
'playlist_mincount': 30,
}, {
'url': 'https://www.hotstar.com/in/tv/bigg-boss/14714/seasons/season-4/ss-8208/',
'info_dict': {
'id': '8208',
},
'playlist_mincount': 19,
}, {
'url': 'https://www.hotstar.com/in/shows/bigg-boss/14714/seasons/season-4/ss-8208/',
'only_matching': True,
}]
def _real_extract(self, url):
url, season_id = self._match_valid_url(url).groups()
return self.playlist_result(self._playlist_entries(
'season/asset', season_id, url, query={'tao': 0, 'tas': 0, 'size': 10000, 'id': season_id}), season_id)
class HotStarSeriesIE(HotStarBaseIE):
IE_NAME = 'hotstar:series'
_VALID_URL = r'(?P<url>https?://(?:www\.)?hotstar\.com(?:/in)?/(?:tv|shows)/[^/]+/(?P<id>\d+))/?(?:[#?]|$)'
@ -431,29 +443,25 @@ class HotStarSeriesIE(HotStarBaseIE):
'info_dict': {
'id': '1260050431',
},
'playlist_mincount': 42,
'playlist_mincount': 43,
}, {
'url': 'https://www.hotstar.com/in/tv/mahabharat/435/',
'info_dict': {
'id': '435',
},
'playlist_mincount': 267,
}, { # HTTP Error 504 with tas=10000 (possibly because total size is over 1000 items?)
}, {
'url': 'https://www.hotstar.com/in/shows/anupama/1260022017/',
'info_dict': {
'id': '1260022017',
},
'playlist_mincount': 1601,
'playlist_mincount': 940,
}]
_PAGE_SIZE = 100
def _real_extract(self, url):
url, series_id = self._match_valid_url(url).group('url', 'id')
eid = self._call_api_v1(
url, series_id = self._match_valid_url(url).groups()
id_ = self._call_api_v1(
'show/detail', series_id, query={'contentId': series_id})['body']['results']['item']['id']
entries = OnDemandPagedList(functools.partial(
self._fetch_page, 'tray/g/1/items', series_id,
'series', {'etid': 0, 'eid': eid}, url), self._PAGE_SIZE)
return self.playlist_result(entries, series_id)
return self.playlist_result(self._playlist_entries(
'tray/g/1/items', series_id, url, query={'tao': 0, 'tas': 10000, 'etid': 0, 'eid': id_}), series_id)

View File

@ -7,13 +7,12 @@ import urllib.parse
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
int_or_none,
parse_duration,
str_or_none,
try_get,
unescapeHTML,
update_url,
unified_strdate,
update_url_query,
url_or_none,
)
@ -23,8 +22,8 @@ from ..utils.traversal import traverse_obj
class HuyaLiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.|m\.)?huya\.com/(?!(?:video/play/))(?P<id>[^/#?&]+)(?:\D|$)'
IE_NAME = 'huya:live'
IE_DESC = '虎牙直播'
_TESTS = [{
IE_DESC = 'huya.com'
TESTS = [{
'url': 'https://www.huya.com/572329',
'info_dict': {
'id': '572329',
@ -150,94 +149,63 @@ class HuyaVideoIE(InfoExtractor):
'id': '1002412640',
'ext': 'mp4',
'title': '8月3日',
'categories': ['主机游戏'],
'duration': 14.0,
'thumbnail': r're:https?://.*\.jpg',
'duration': 14,
'uploader': '虎牙-ATS欧卡车队青木',
'uploader_id': '1564376151',
'upload_date': '20240803',
'view_count': int,
'comment_count': int,
'like_count': int,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1722675433,
},
}, {
},
{
'url': 'https://www.huya.com/video/play/556054543.html',
'info_dict': {
'id': '556054543',
'ext': 'mp4',
'title': '我不挑事 也不怕事',
'categories': ['英雄联盟'],
'description': 'md5:58184869687d18ce62dc7b4b2ad21201',
'duration': 1864.0,
'thumbnail': r're:https?://.*\.jpg',
'duration': 1864,
'uploader': '卡尔',
'uploader_id': '367138632',
'upload_date': '20210811',
'view_count': int,
'comment_count': int,
'like_count': int,
'tags': 'count:4',
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1628675950,
},
}, {
# Only m3u8 available
'url': 'https://www.huya.com/video/play/1063345618.html',
'info_dict': {
'id': '1063345618',
'ext': 'mp4',
'title': '峡谷第一中黑铁上钻石顶级教学对抗elo',
'categories': ['英雄联盟'],
'comment_count': int,
'duration': 21603.0,
'like_count': int,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1749668803,
'upload_date': '20250611',
'uploader': '北枫CC',
'uploader_id': '2183525275',
'view_count': int,
},
}]
def _real_extract(self, url: str):
video_id = self._match_id(url)
moment = self._download_json(
'https://liveapi.huya.com/moment/getMomentContent',
video_id, query={'videoId': video_id})['data']['moment']
video_data = self._download_json(
'https://liveapi.huya.com/moment/getMomentContent', video_id,
query={'videoId': video_id})['data']['moment']['videoInfo']
formats = []
for definition in traverse_obj(moment, (
'videoInfo', 'definitions', lambda _, v: url_or_none(v['m3u8']),
)):
fmts = self._extract_m3u8_formats(definition['m3u8'], video_id, 'mp4', fatal=False)
for fmt in fmts:
fmt.update(**traverse_obj(definition, {
'filesize': ('size', {int_or_none}),
for definition in traverse_obj(video_data, ('definitions', lambda _, v: url_or_none(v['url']))):
formats.append({
'url': definition['url'],
**traverse_obj(definition, {
'format_id': ('defName', {str}),
'height': ('height', {int_or_none}),
'quality': ('definition', {int_or_none}),
'width': ('width', {int_or_none}),
}))
formats.extend(fmts)
'height': ('height', {int_or_none}),
'filesize': ('size', {int_or_none}),
}),
})
return {
'id': video_id,
'formats': formats,
**traverse_obj(moment, {
'comment_count': ('commentCount', {int_or_none}),
'description': ('content', {clean_html}, filter),
'like_count': ('favorCount', {int_or_none}),
'timestamp': ('cTime', {int_or_none}),
}),
**traverse_obj(moment, ('videoInfo', {
**traverse_obj(video_data, {
'title': ('videoTitle', {str}),
'categories': ('category', {str}, filter, all, filter),
'thumbnail': ('videoCover', {url_or_none}),
'duration': ('videoDuration', {parse_duration}),
'tags': ('tags', ..., {str}, filter, all, filter),
'thumbnail': (('videoBigCover', 'videoCover'), {url_or_none}, {update_url(query=None)}, any),
'uploader': ('nickName', {str}),
'uploader_id': ('uid', {str_or_none}),
'upload_date': ('videoUploadTime', {unified_strdate}),
'view_count': ('videoPlayNum', {int_or_none}),
})),
'comment_count': ('videoCommentNum', {int_or_none}),
'like_count': ('favorCount', {int_or_none}),
}),
}

View File

@ -1,66 +1,32 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
url_or_none,
)
from ..utils.traversal import subs_list_to_dict, traverse_obj
from ..utils import js_to_json, traverse_obj
class MonsterSirenHypergryphMusicIE(InfoExtractor):
IE_NAME = 'monstersiren'
IE_DESC = '塞壬唱片'
_API_BASE = 'https://monster-siren.hypergryph.com/api'
_VALID_URL = r'https?://monster-siren\.hypergryph\.com/music/(?P<id>\d+)'
_TESTS = [{
'url': 'https://monster-siren.hypergryph.com/music/514562',
'info_dict': {
'id': '514562',
'ext': 'wav',
'title': 'Flame Shadow',
'album': 'Flame Shadow',
'artists': ['塞壬唱片-MSR'],
'description': 'md5:19e2acfcd1b65b41b29e8079ab948053',
'thumbnail': r're:https?://web\.hycdn\.cn/siren/pic/.+\.jpg',
},
}, {
'url': 'https://monster-siren.hypergryph.com/music/514518',
'info_dict': {
'id': '514518',
'ext': 'wav',
'title': 'Heavenly Me (Instrumental)',
'album': 'Heavenly Me',
'artists': ['塞壬唱片-MSR', 'AIYUE blessed : 理名'],
'description': 'md5:ce790b41c932d1ad72eb791d1d8ae598',
'thumbnail': r're:https?://web\.hycdn\.cn/siren/pic/.+\.jpg',
'album': 'Flame Shadow',
'title': 'Flame Shadow',
},
}]
def _real_extract(self, url):
audio_id = self._match_id(url)
song = self._download_json(f'{self._API_BASE}/song/{audio_id}', audio_id)
if traverse_obj(song, 'code') != 0:
msg = traverse_obj(song, ('msg', {str}, filter))
raise ExtractorError(
msg or 'API returned an error response', expected=bool(msg))
album = None
if album_id := traverse_obj(song, ('data', 'albumCid', {str})):
album = self._download_json(
f'{self._API_BASE}/album/{album_id}/detail', album_id, fatal=False)
webpage = self._download_webpage(url, audio_id)
json_data = self._search_json(
r'window\.g_initialProps\s*=', webpage, 'data', audio_id, transform_source=js_to_json)
return {
'id': audio_id,
'title': traverse_obj(json_data, ('player', 'songDetail', 'name')),
'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')),
'ext': 'wav',
'vcodec': 'none',
**traverse_obj(song, ('data', {
'title': ('name', {str}),
'artists': ('artists', ..., {str}),
'subtitles': ({'url': 'lyricUrl'}, all, {subs_list_to_dict(lang='en')}),
'url': ('sourceUrl', {url_or_none}),
})),
**traverse_obj(album, ('data', {
'album': ('name', {str}),
'description': ('intro', {clean_html}),
'thumbnail': ('coverUrl', {url_or_none}),
})),
'artists': traverse_obj(json_data, ('player', 'songDetail', 'artists', ...)),
'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name')),
}

View File

@ -0,0 +1,408 @@
import base64
import itertools
import json
import random
import re
import string
import time
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
jwt_decode_hs256,
parse_age_limit,
try_call,
url_or_none,
)
from ..utils.traversal import traverse_obj
class JioCinemaBaseIE(InfoExtractor):
_NETRC_MACHINE = 'jiocinema'
_GEO_BYPASS = False
_ACCESS_TOKEN = None
_REFRESH_TOKEN = None
_GUEST_TOKEN = None
_USER_ID = None
_DEVICE_ID = None
_API_HEADERS = {'Origin': 'https://www.jiocinema.com', 'Referer': 'https://www.jiocinema.com/'}
_APP_NAME = {'appName': 'RJIL_JioCinema'}
_APP_VERSION = {'appVersion': '5.0.0'}
_API_SIGNATURES = 'o668nxgzwff'
_METADATA_API_BASE = 'https://content-jiovoot.voot.com/psapi'
_ACCESS_HINT = 'the `accessToken` from your browser local storage'
_LOGIN_HINT = (
'Log in with "-u phone -p <PHONE_NUMBER>" to authenticate with OTP, '
f'or use "-u token -p <ACCESS_TOKEN>" to log in with {_ACCESS_HINT}. '
'If you have previously logged in with yt-dlp and your session '
'has been cached, you can use "-u device -p <DEVICE_ID>"')
def _cache_token(self, token_type):
assert token_type in ('access', 'refresh', 'all')
if token_type in ('access', 'all'):
self.cache.store(
JioCinemaBaseIE._NETRC_MACHINE, f'{JioCinemaBaseIE._DEVICE_ID}-access', JioCinemaBaseIE._ACCESS_TOKEN)
if token_type in ('refresh', 'all'):
self.cache.store(
JioCinemaBaseIE._NETRC_MACHINE, f'{JioCinemaBaseIE._DEVICE_ID}-refresh', JioCinemaBaseIE._REFRESH_TOKEN)
def _call_api(self, url, video_id, note='Downloading API JSON', headers={}, data={}):
return self._download_json(
url, video_id, note, data=json.dumps(data, separators=(',', ':')).encode(), headers={
'Content-Type': 'application/json',
'Accept': 'application/json',
**self._API_HEADERS,
**headers,
}, expected_status=(400, 403, 474))
def _call_auth_api(self, service, endpoint, note, headers={}, data={}):
return self._call_api(
f'https://auth-jiocinema.voot.com/{service}service/apis/v4/{endpoint}',
None, note=note, headers=headers, data=data)
def _refresh_token(self):
if not JioCinemaBaseIE._REFRESH_TOKEN or not JioCinemaBaseIE._DEVICE_ID:
raise ExtractorError('User token has expired', expected=True)
response = self._call_auth_api(
'token', 'refreshtoken', 'Refreshing token',
headers={'accesstoken': self._ACCESS_TOKEN}, data={
**self._APP_NAME,
'deviceId': self._DEVICE_ID,
'refreshToken': self._REFRESH_TOKEN,
**self._APP_VERSION,
})
refresh_token = response.get('refreshTokenId')
if refresh_token and refresh_token != JioCinemaBaseIE._REFRESH_TOKEN:
JioCinemaBaseIE._REFRESH_TOKEN = refresh_token
self._cache_token('refresh')
JioCinemaBaseIE._ACCESS_TOKEN = response['authToken']
self._cache_token('access')
def _fetch_guest_token(self):
JioCinemaBaseIE._DEVICE_ID = ''.join(random.choices(string.digits, k=10))
guest_token = self._call_auth_api(
'token', 'guest', 'Downloading guest token', data={
**self._APP_NAME,
'deviceType': 'phone',
'os': 'ios',
'deviceId': self._DEVICE_ID,
'freshLaunch': False,
'adId': self._DEVICE_ID,
**self._APP_VERSION,
})
self._GUEST_TOKEN = guest_token['authToken']
self._USER_ID = guest_token['userId']
def _call_login_api(self, endpoint, guest_token, data, note):
return self._call_auth_api(
'user', f'loginotp/{endpoint}', note, headers={
**self.geo_verification_headers(),
'accesstoken': self._GUEST_TOKEN,
**self._APP_NAME,
**traverse_obj(guest_token, 'data', {
'deviceType': ('deviceType', {str}),
'os': ('os', {str}),
})}, data=data)
def _is_token_expired(self, token):
return (try_call(lambda: jwt_decode_hs256(token)['exp']) or 0) <= int(time.time() - 180)
def _perform_login(self, username, password):
if self._ACCESS_TOKEN and not self._is_token_expired(self._ACCESS_TOKEN):
return
UUID_RE = r'[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12}'
if username.lower() == 'token':
if try_call(lambda: jwt_decode_hs256(password)):
JioCinemaBaseIE._ACCESS_TOKEN = password
refresh_hint = 'the `refreshToken` UUID from your browser local storage'
refresh_token = self._configuration_arg('refresh_token', [''], ie_key=JioCinemaIE)[0]
if not refresh_token:
self.to_screen(
'To extend the life of your login session, in addition to your access token, '
'you can pass --extractor-args "jiocinema:refresh_token=REFRESH_TOKEN" '
f'where REFRESH_TOKEN is {refresh_hint}')
elif re.fullmatch(UUID_RE, refresh_token):
JioCinemaBaseIE._REFRESH_TOKEN = refresh_token
else:
self.report_warning(f'Invalid refresh_token value. Use {refresh_hint}')
else:
raise ExtractorError(
f'The password given could not be decoded as a token; use {self._ACCESS_HINT}', expected=True)
elif username.lower() == 'device' and re.fullmatch(rf'(?:{UUID_RE}|\d+)', password):
JioCinemaBaseIE._REFRESH_TOKEN = self.cache.load(JioCinemaBaseIE._NETRC_MACHINE, f'{password}-refresh')
JioCinemaBaseIE._ACCESS_TOKEN = self.cache.load(JioCinemaBaseIE._NETRC_MACHINE, f'{password}-access')
if not JioCinemaBaseIE._REFRESH_TOKEN or not JioCinemaBaseIE._ACCESS_TOKEN:
raise ExtractorError(f'Failed to load cached tokens for device ID "{password}"', expected=True)
elif username.lower() == 'phone' and re.fullmatch(r'\+?\d+', password):
self._fetch_guest_token()
guest_token = jwt_decode_hs256(self._GUEST_TOKEN)
initial_data = {
'number': base64.b64encode(password.encode()).decode(),
**self._APP_VERSION,
}
response = self._call_login_api('send', guest_token, initial_data, 'Requesting OTP')
if not traverse_obj(response, ('OTPInfo', {dict})):
raise ExtractorError('There was a problem with the phone number login attempt')
is_iphone = guest_token.get('os') == 'ios'
response = self._call_login_api('verify', guest_token, {
'deviceInfo': {
'consumptionDeviceName': 'iPhone' if is_iphone else 'Android',
'info': {
'platform': {'name': 'iPhone OS' if is_iphone else 'Android'},
'androidId': self._DEVICE_ID,
'type': 'iOS' if is_iphone else 'Android',
},
},
**initial_data,
'otp': self._get_tfa_info('the one-time password sent to your phone'),
}, 'Submitting OTP')
if traverse_obj(response, 'code') == 1043:
raise ExtractorError('Wrong OTP', expected=True)
JioCinemaBaseIE._REFRESH_TOKEN = response['refreshToken']
JioCinemaBaseIE._ACCESS_TOKEN = response['authToken']
else:
raise ExtractorError(self._LOGIN_HINT, expected=True)
user_token = jwt_decode_hs256(JioCinemaBaseIE._ACCESS_TOKEN)['data']
JioCinemaBaseIE._USER_ID = user_token['userId']
JioCinemaBaseIE._DEVICE_ID = user_token['deviceId']
if JioCinemaBaseIE._REFRESH_TOKEN and username != 'device':
self._cache_token('all')
if self.get_param('cachedir') is not False:
self.to_screen(
f'NOTE: For subsequent logins you can use "-u device -p {JioCinemaBaseIE._DEVICE_ID}"')
elif not JioCinemaBaseIE._REFRESH_TOKEN:
JioCinemaBaseIE._REFRESH_TOKEN = self.cache.load(
JioCinemaBaseIE._NETRC_MACHINE, f'{JioCinemaBaseIE._DEVICE_ID}-refresh')
if JioCinemaBaseIE._REFRESH_TOKEN:
self._cache_token('access')
self.to_screen(f'Logging in as device ID "{JioCinemaBaseIE._DEVICE_ID}"')
if self._is_token_expired(JioCinemaBaseIE._ACCESS_TOKEN):
self._refresh_token()
class JioCinemaIE(JioCinemaBaseIE):
IE_NAME = 'jiocinema'
_VALID_URL = r'https?://(?:www\.)?jiocinema\.com/?(?:movies?/[^/?#]+/|tv-shows/(?:[^/?#]+/){3})(?P<id>\d{3,})'
_TESTS = [{
'url': 'https://www.jiocinema.com/tv-shows/agnisakshi-ek-samjhauta/1/pradeep-to-stop-the-wedding/3759931',
'info_dict': {
'id': '3759931',
'ext': 'mp4',
'title': 'Pradeep to stop the wedding?',
'description': 'md5:75f72d1d1a66976633345a3de6d672b1',
'episode': 'Pradeep to stop the wedding?',
'episode_number': 89,
'season': 'Agnisakshi…Ek Samjhauta-S1',
'season_number': 1,
'series': 'Agnisakshi Ek Samjhauta',
'duration': 1238.0,
'thumbnail': r're:https?://.+\.jpg',
'age_limit': 13,
'season_id': '3698031',
'upload_date': '20230606',
'timestamp': 1686009600,
'release_date': '20230607',
'genres': ['Drama'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.jiocinema.com/movies/bhediya/3754021/watch',
'info_dict': {
'id': '3754021',
'ext': 'mp4',
'title': 'Bhediya',
'description': 'md5:a6bf2900371ac2fc3f1447401a9f7bb0',
'episode': 'Bhediya',
'duration': 8500.0,
'thumbnail': r're:https?://.+\.jpg',
'age_limit': 13,
'upload_date': '20230525',
'timestamp': 1685026200,
'release_date': '20230524',
'genres': ['Comedy'],
},
'params': {'skip_download': 'm3u8'},
}]
def _extract_formats_and_subtitles(self, playback, video_id):
m3u8_url = traverse_obj(playback, (
'data', 'playbackUrls', lambda _, v: v['streamtype'] == 'hls', 'url', {url_or_none}, any))
if not m3u8_url: # DRM-only content only serves dash urls
self.report_drm(video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, m3u8_id='hls')
self._remove_duplicate_formats(formats)
return {
# '/_definst_/smil:vod/' m3u8 manifests claim to have 720p+ formats but max out at 480p
'formats': traverse_obj(formats, (
lambda _, v: '/_definst_/smil:vod/' not in v['url'] or v['height'] <= 480)),
'subtitles': subtitles,
}
def _real_extract(self, url):
video_id = self._match_id(url)
if not self._ACCESS_TOKEN and self._is_token_expired(self._GUEST_TOKEN):
self._fetch_guest_token()
elif self._ACCESS_TOKEN and self._is_token_expired(self._ACCESS_TOKEN):
self._refresh_token()
playback = self._call_api(
f'https://apis-jiovoot.voot.com/playbackjv/v3/{video_id}', video_id,
'Downloading playback JSON', headers={
**self.geo_verification_headers(),
'accesstoken': self._ACCESS_TOKEN or self._GUEST_TOKEN,
**self._APP_NAME,
'deviceid': self._DEVICE_ID,
'uniqueid': self._USER_ID,
'x-apisignatures': self._API_SIGNATURES,
'x-platform': 'androidweb',
'x-platform-token': 'web',
}, data={
'4k': False,
'ageGroup': '18+',
'appVersion': '3.4.0',
'bitrateProfile': 'xhdpi',
'capability': {
'drmCapability': {
'aesSupport': 'yes',
'fairPlayDrmSupport': 'none',
'playreadyDrmSupport': 'none',
'widevineDRMSupport': 'none',
},
'frameRateCapability': [{
'frameRateSupport': '30fps',
'videoQuality': '1440p',
}],
},
'continueWatchingRequired': False,
'dolby': False,
'downloadRequest': False,
'hevc': False,
'kidsSafe': False,
'manufacturer': 'Windows',
'model': 'Windows',
'multiAudioRequired': True,
'osVersion': '10',
'parentalPinValid': True,
'x-apisignatures': self._API_SIGNATURES,
})
status_code = traverse_obj(playback, ('code', {int}))
if status_code == 474:
self.raise_geo_restricted(countries=['IN'])
elif status_code == 1008:
error_msg = 'This content is only available for premium users'
if self._ACCESS_TOKEN:
raise ExtractorError(error_msg, expected=True)
self.raise_login_required(f'{error_msg}. {self._LOGIN_HINT}', method=None)
elif status_code == 400:
raise ExtractorError('The requested content is not available', expected=True)
elif status_code is not None and status_code != 200:
raise ExtractorError(
f'JioCinema says: {traverse_obj(playback, ("message", {str})) or status_code}')
metadata = self._download_json(
f'{self._METADATA_API_BASE}/voot/v1/voot-web/content/query/asset-details',
video_id, fatal=False, query={
'ids': f'include:{video_id}',
'responseType': 'common',
'devicePlatformType': 'desktop',
})
return {
'id': video_id,
'http_headers': self._API_HEADERS,
**self._extract_formats_and_subtitles(playback, video_id),
**traverse_obj(playback, ('data', {
# fallback metadata
'title': ('name', {str}),
'description': ('fullSynopsis', {str}),
'series': ('show', 'name', {str}, filter),
'season': ('tournamentName', {str}, {lambda x: x if x != 'Season 0' else None}),
'season_number': ('episode', 'season', {int_or_none}, filter),
'episode': ('fullTitle', {str}),
'episode_number': ('episode', 'episodeNo', {int_or_none}, filter),
'age_limit': ('ageNemonic', {parse_age_limit}),
'duration': ('totalDuration', {float_or_none}),
'thumbnail': ('images', {url_or_none}),
})),
**traverse_obj(metadata, ('result', 0, {
'title': ('fullTitle', {str}),
'description': ('fullSynopsis', {str}),
'series': ('showName', {str}, filter),
'season': ('seasonName', {str}, filter),
'season_number': ('season', {int_or_none}),
'season_id': ('seasonId', {str}, filter),
'episode': ('fullTitle', {str}),
'episode_number': ('episode', {int_or_none}),
'timestamp': ('uploadTime', {int_or_none}),
'release_date': ('telecastDate', {str}),
'age_limit': ('ageNemonic', {parse_age_limit}),
'duration': ('duration', {float_or_none}),
'genres': ('genres', ..., {str}),
'thumbnail': ('seo', 'ogImage', {url_or_none}),
})),
}
class JioCinemaSeriesIE(JioCinemaBaseIE):
IE_NAME = 'jiocinema:series'
_VALID_URL = r'https?://(?:www\.)?jiocinema\.com/tv-shows/(?P<slug>[\w-]+)/(?P<id>\d{3,})'
_TESTS = [{
'url': 'https://www.jiocinema.com/tv-shows/naagin/3499917',
'info_dict': {
'id': '3499917',
'title': 'naagin',
},
'playlist_mincount': 120,
}, {
'url': 'https://www.jiocinema.com/tv-shows/mtv-splitsvilla-x5/3499820',
'info_dict': {
'id': '3499820',
'title': 'mtv-splitsvilla-x5',
},
'playlist_mincount': 310,
}]
def _entries(self, series_id):
seasons = traverse_obj(self._download_json(
f'{self._METADATA_API_BASE}/voot/v1/voot-web/view/show/{series_id}', series_id,
'Downloading series metadata JSON', query={'responseType': 'common'}), (
'trays', lambda _, v: v['trayId'] == 'season-by-show-multifilter',
'trayTabs', lambda _, v: v['id']))
for season_num, season in enumerate(seasons, start=1):
season_id = season['id']
label = season.get('label') or season_num
for page_num in itertools.count(1):
episodes = traverse_obj(self._download_json(
f'{self._METADATA_API_BASE}/voot/v1/voot-web/content/generic/series-wise-episode',
season_id, f'Downloading season {label} page {page_num} JSON', query={
'sort': 'episode:asc',
'id': season_id,
'responseType': 'common',
'page': page_num,
}), ('result', lambda _, v: v['id'] and url_or_none(v['slug'])))
if not episodes:
break
for episode in episodes:
yield self.url_result(
episode['slug'], JioCinemaIE, **traverse_obj(episode, {
'video_id': 'id',
'video_title': ('fullTitle', {str}),
'season_number': ('season', {int_or_none}),
'episode_number': ('episode', {int_or_none}),
}))
def _real_extract(self, url):
slug, series_id = self._match_valid_url(url).group('slug', 'id')
return self.playlist_result(self._entries(series_id), series_id, slug)

View File

@ -1,12 +1,12 @@
import functools
import urllib.parse
from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
UserNotLive,
determine_ext,
float_or_none,
int_or_none,
merge_dicts,
parse_iso8601,
str_or_none,
traverse_obj,
@ -16,17 +16,21 @@ from ..utils import (
class KickBaseIE(InfoExtractor):
@functools.cached_property
def _api_headers(self):
token = traverse_obj(
self._get_cookies('https://kick.com/'),
('session_token', 'value', {urllib.parse.unquote}))
return {'Authorization': f'Bearer {token}'} if token else {}
def _real_initialize(self):
self._request_webpage(
HEADRequest('https://kick.com/'), None, 'Setting up session', fatal=False, impersonate=True)
xsrf_token = self._get_cookies('https://kick.com/').get('XSRF-TOKEN')
if not xsrf_token:
self.write_debug('kick.com did not set XSRF-TOKEN cookie')
KickBaseIE._API_HEADERS = {
'Authorization': f'Bearer {xsrf_token.value}',
'X-XSRF-TOKEN': xsrf_token.value,
} if xsrf_token else {}
def _call_api(self, path, display_id, note='Downloading API JSON', headers={}, **kwargs):
return self._download_json(
f'https://kick.com/api/{path}', display_id, note=note,
headers={**self._api_headers, **headers}, impersonate=True, **kwargs)
headers=merge_dicts(headers, self._API_HEADERS), impersonate=True, **kwargs)
class KickIE(KickBaseIE):

View File

@ -167,11 +167,11 @@ class LSMLTVEmbedIE(InfoExtractor):
'duration': 1442,
'upload_date': '20231121',
'title': 'D23-6000-105_cetstud',
'thumbnail': 'https://store.bstrm.net/tmsp00060/assets/media/660858/placeholder1700589200.jpg',
'thumbnail': 'https://store.cloudycdn.services/tmsp00060/assets/media/660858/placeholder1700589200.jpg',
},
}, {
'url': 'https://ltv.lsm.lv/embed?enablesdkjs=1&c=eyJpdiI6IncwVzZmUFk2MU12enVWK1I3SUcwQ1E9PSIsInZhbHVlIjoid3FhV29vamc3T2sxL1RaRmJ5Rm1GTXozU0o2dVczdUtLK0cwZEZJMDQ2a3ZIRG5DK2pneGlnbktBQy9uazVleHN6VXhxdWIweWNvcHRDSnlISlNYOHlVZ1lpcTUrcWZSTUZPQW14TVdkMW9aOUtRWVNDcFF4eWpHNGcrT0VZbUNFQStKQk91cGpndW9FVjJIa0lpbkh3PT0iLCJtYWMiOiIyZGI1NDJlMWRlM2QyMGNhOGEwYTM2MmNlN2JlOGRhY2QyYjdkMmEzN2RlOTEzYTVkNzI1ODlhZDlhZjU4MjQ2IiwidGFnIjoiIn0=',
'md5': 'f236cef2fd5953612754e4e66be51e7a',
'md5': 'a1711e190fe680fdb68fd8413b378e87',
'info_dict': {
'id': 'wUnFArIPDSY',
'ext': 'mp4',
@ -198,8 +198,6 @@ class LSMLTVEmbedIE(InfoExtractor):
'uploader_url': 'https://www.youtube.com/@LTV16plus',
'like_count': int,
'description': 'md5:7ff0c42ba971e3c13e4b8a2ff03b70b5',
'media_type': 'livestream',
'timestamp': 1652550741,
},
}]
@ -210,7 +208,7 @@ class LSMLTVEmbedIE(InfoExtractor):
r'window\.ltvEmbedPayload\s*=', webpage, 'embed json', video_id)
embed_type = traverse_obj(data, ('source', 'name', {str}))
if embed_type in ('backscreen', 'telia'): # 'telia' only for backwards compat
if embed_type == 'telia':
ie_key = 'CloudyCDN'
embed_url = traverse_obj(data, ('source', 'embed_url', {url_or_none}))
elif embed_type == 'youtube':
@ -228,9 +226,9 @@ class LSMLTVEmbedIE(InfoExtractor):
class LSMReplayIE(InfoExtractor):
_VALID_URL = r'https?://replay\.lsm\.lv/[^/?#]+/(?:skaties/|klausies/)?(?:ieraksts|statja)/[^/?#]+/(?P<id>\d+)'
_VALID_URL = r'https?://replay\.lsm\.lv/[^/?#]+/(?:ieraksts|statja)/[^/?#]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://replay.lsm.lv/lv/skaties/ieraksts/ltv/311130/4-studija-zolitudes-tragedija-un-incupes-stacija',
'url': 'https://replay.lsm.lv/lv/ieraksts/ltv/311130/4-studija-zolitudes-tragedija-un-incupes-stacija',
'md5': '64f72a360ca530d5ed89c77646c9eee5',
'info_dict': {
'id': '46k_d23-6000-105',
@ -243,23 +241,20 @@ class LSMReplayIE(InfoExtractor):
'thumbnail': 'https://ltv.lsm.lv/storage/media/8/7/large/5/1f9604e1.jpg',
},
}, {
'url': 'https://replay.lsm.lv/lv/klausies/ieraksts/lr/183522/138-nepilniga-kompensejamo-zalu-sistema-pat-menesiem-dzena-pacientus-pa-aptiekam',
'md5': '84feb80fd7e6ec07744726a9f01cda4d',
'url': 'https://replay.lsm.lv/lv/ieraksts/lr/183522/138-nepilniga-kompensejamo-zalu-sistema-pat-menesiem-dzena-pacientus-pa-aptiekam',
'md5': '719b33875cd1429846eeeaeec6df2830',
'info_dict': {
'id': '183522',
'ext': 'm4a',
'id': 'a342781',
'ext': 'mp3',
'duration': 1823,
'title': '#138 Nepilnīgā kompensējamo zāļu sistēma pat mēnešiem dzenā pacientus pa aptiekām',
'thumbnail': 'https://pic.latvijasradio.lv/public/assets/media/9/d/large_fd4675ac.jpg',
'upload_date': '20231102',
'timestamp': 1698913860,
'timestamp': 1698921060,
'description': 'md5:7bac3b2dd41e44325032943251c357b1',
},
}, {
'url': 'https://replay.lsm.lv/ru/skaties/statja/ltv/355067/v-kengaragse-nacalas-ukladka-relsov',
'only_matching': True,
}, {
'url': 'https://replay.lsm.lv/lv/ieraksts/ltv/311130/4-studija-zolitudes-tragedija-un-incupes-stacija',
'url': 'https://replay.lsm.lv/ru/statja/ltv/311130/4-studija-zolitudes-tragedija-un-incupes-stacija',
'only_matching': True,
}]
@ -272,24 +267,12 @@ class LSMReplayIE(InfoExtractor):
data = self._search_nuxt_data(
self._fix_nuxt_data(webpage), video_id, context_name='__REPLAY__')
playback_type = data['playback']['type']
if playback_type == 'playable_audio_lr':
playback_data = {
'formats': self._extract_m3u8_formats(data['playback']['service']['hls_url'], video_id),
}
elif playback_type == 'embed':
playback_data = {
'_type': 'url_transparent',
'url': data['playback']['service']['url'],
}
else:
raise ExtractorError(f'Unsupported playback type "{playback_type}"')
return {
'_type': 'url_transparent',
'id': video_id,
**playback_data,
**traverse_obj(data, {
'url': ('playback', 'service', 'url', {url_or_none}),
'title': ('mediaItem', 'title'),
'description': ('mediaItem', ('lead', 'body')),
'duration': ('mediaItem', 'duration', {int_or_none}),

View File

@ -1,107 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
int_or_none,
parse_iso8601,
urljoin,
)
from ..utils.traversal import require, traverse_obj
class MaveIE(InfoExtractor):
_VALID_URL = r'https?://(?P<channel>[\w-]+)\.mave\.digital/(?P<id>ep-\d+)'
_TESTS = [{
'url': 'https://ochenlichnoe.mave.digital/ep-25',
'md5': 'aa3e513ef588b4366df1520657cbc10c',
'info_dict': {
'id': '4035f587-914b-44b6-aa5a-d76685ad9bc2',
'ext': 'mp3',
'display_id': 'ochenlichnoe-ep-25',
'title': 'Между мной и миром: психология самооценки',
'description': 'md5:4b7463baaccb6982f326bce5c700382a',
'uploader': 'Самарский университет',
'channel': 'Очень личное',
'channel_id': 'ochenlichnoe',
'channel_url': 'https://ochenlichnoe.mave.digital/',
'view_count': int,
'like_count': int,
'dislike_count': int,
'duration': 3744,
'thumbnail': r're:https://.+/storage/podcasts/.+\.jpg',
'series': 'Очень личное',
'series_id': '2e0c3749-6df2-4946-82f4-50691419c065',
'season': 'Season 3',
'season_number': 3,
'episode': 'Episode 3',
'episode_number': 3,
'timestamp': 1747817300,
'upload_date': '20250521',
},
}, {
'url': 'https://budem.mave.digital/ep-12',
'md5': 'e1ce2780fcdb6f17821aa3ca3e8c919f',
'info_dict': {
'id': '41898bb5-ff57-4797-9236-37a8e537aa21',
'ext': 'mp3',
'display_id': 'budem-ep-12',
'title': 'Екатерина Михайлова: "Горе от ума" не про женщин написана',
'description': 'md5:fa3bdd59ee829dfaf16e3efcb13f1d19',
'uploader': 'Полина Цветкова+Евгения Акопова',
'channel': 'Все там будем',
'channel_id': 'budem',
'channel_url': 'https://budem.mave.digital/',
'view_count': int,
'like_count': int,
'dislike_count': int,
'age_limit': 18,
'duration': 3664,
'thumbnail': r're:https://.+/storage/podcasts/.+\.jpg',
'series': 'Все там будем',
'series_id': 'fe9347bf-c009-4ebd-87e8-b06f2f324746',
'season': 'Season 2',
'season_number': 2,
'episode': 'Episode 5',
'episode_number': 5,
'timestamp': 1735538400,
'upload_date': '20241230',
},
}]
_API_BASE_URL = 'https://api.mave.digital/'
def _real_extract(self, url):
channel_id, slug = self._match_valid_url(url).group('channel', 'id')
display_id = f'{channel_id}-{slug}'
webpage = self._download_webpage(url, display_id)
data = traverse_obj(
self._search_nuxt_json(webpage, display_id),
('data', lambda _, v: v['activeEpisodeData'], any, {require('podcast data')}))
return {
'display_id': display_id,
'channel_id': channel_id,
'channel_url': f'https://{channel_id}.mave.digital/',
'vcodec': 'none',
'thumbnail': re.sub(r'_\d+(?=\.(?:jpg|png))', '', self._og_search_thumbnail(webpage, default='')) or None,
**traverse_obj(data, ('activeEpisodeData', {
'url': ('audio', {urljoin(self._API_BASE_URL)}),
'id': ('id', {str}),
'title': ('title', {str}),
'description': ('description', {clean_html}),
'duration': ('duration', {int_or_none}),
'season_number': ('season', {int_or_none}),
'episode_number': ('number', {int_or_none}),
'view_count': ('listenings', {int_or_none}),
'like_count': ('reactions', lambda _, v: v['type'] == 'like', 'count', {int_or_none}, any),
'dislike_count': ('reactions', lambda _, v: v['type'] == 'dislike', 'count', {int_or_none}, any),
'age_limit': ('is_explicit', {bool}, {lambda x: 18 if x else None}),
'timestamp': ('publish_date', {parse_iso8601}),
})),
**traverse_obj(data, ('podcast', 'podcast', {
'series_id': ('id', {str}),
'series': ('title', {str}),
'channel': ('title', {str}),
'uploader': ('author', {str}),
})),
}

View File

@ -4,15 +4,16 @@ import itertools
import json
import re
import time
import urllib.parse
from .common import InfoExtractor, SearchInfoExtractor
from ..networking import Request
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
OnDemandPagedList,
clean_html,
determine_ext,
extract_attributes,
float_or_none,
int_or_none,
parse_bitrate,
@ -21,8 +22,9 @@ from ..utils import (
parse_qs,
parse_resolution,
qualities,
remove_start,
str_or_none,
truncate_string,
unescapeHTML,
unified_timestamp,
update_url_query,
url_basename,
@ -30,11 +32,7 @@ from ..utils import (
urlencode_postdata,
urljoin,
)
from ..utils.traversal import (
find_element,
require,
traverse_obj,
)
from ..utils.traversal import find_element, require, traverse_obj
class NiconicoBaseIE(InfoExtractor):
@ -808,39 +806,41 @@ class NiconicoLiveIE(NiconicoBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, expected_status=404)
if err_msg := traverse_obj(webpage, ({find_element(cls='message')}, {clean_html})):
raise ExtractorError(err_msg, expected=True)
webpage, urlh = self._download_webpage_handle(f'https://live.nicovideo.jp/watch/{video_id}', video_id)
embedded_data = traverse_obj(webpage, (
{find_element(tag='script', id='embedded-data', html=True)},
{extract_attributes}, 'data-props', {json.loads}))
frontend_id = traverse_obj(embedded_data, ('site', 'frontendId', {str_or_none}), default='9')
embedded_data = self._parse_json(unescapeHTML(self._search_regex(
r'<script\s+id="embedded-data"\s*data-props="(.+?)"', webpage, 'embedded data')), video_id)
ws_url = traverse_obj(embedded_data, ('site', 'relive', 'webSocketUrl'))
if not ws_url:
raise ExtractorError('The live hasn\'t started yet or already ended.', expected=True)
ws_url = update_url_query(ws_url, {
'frontend_id': traverse_obj(embedded_data, ('site', 'frontendId')) or '9',
})
hostname = remove_start(urllib.parse.urlparse(urlh.url).hostname, 'sp.')
ws_url = traverse_obj(embedded_data, (
'site', 'relive', 'webSocketUrl', {url_or_none}, {require('websocket URL')}))
ws_url = update_url_query(ws_url, {'frontend_id': frontend_id})
ws = self._request_webpage(
ws_url, video_id, 'Connecting to WebSocket server',
headers={'Origin': 'https://live.nicovideo.jp'})
Request(ws_url, headers={'Origin': f'https://{hostname}'}),
video_id=video_id, note='Connecting to WebSocket server')
self.write_debug('Sending HLS server request')
ws.send(json.dumps({
'type': 'startWatching',
'data': {
'reconnect': False,
'room': {
'commentable': True,
'protocol': 'webSocket',
},
'stream': {
'quality': 'abr',
'protocol': 'hls',
'latency': 'high',
'accessRightMethod': 'single_cookie',
'chasePlay': False,
'latency': 'high',
'protocol': 'hls',
'quality': 'abr',
},
'room': {
'protocol': 'webSocket',
'commentable': True,
},
'reconnect': False,
},
'type': 'startWatching',
}))
while True:
@ -860,15 +860,17 @@ class NiconicoLiveIE(NiconicoBaseIE):
raise ExtractorError('Disconnected at middle of extraction')
elif data.get('type') == 'error':
self.write_debug(recv)
message = traverse_obj(data, ('body', 'code', {str_or_none}), default=recv)
message = traverse_obj(data, ('body', 'code')) or recv
raise ExtractorError(message)
elif self.get_param('verbose', False):
self.write_debug(f'Server response: {truncate_string(recv, 100)}')
if len(recv) > 100:
recv = recv[:100] + '...'
self.write_debug(f'Server said: {recv}')
title = traverse_obj(embedded_data, ('program', 'title')) or self._html_search_meta(
('og:title', 'twitter:title'), webpage, 'live title', fatal=False)
raw_thumbs = traverse_obj(embedded_data, ('program', 'thumbnail', {dict})) or {}
raw_thumbs = traverse_obj(embedded_data, ('program', 'thumbnail')) or {}
thumbnails = []
for name, value in raw_thumbs.items():
if not isinstance(value, dict):
@ -895,30 +897,31 @@ class NiconicoLiveIE(NiconicoBaseIE):
cookie['domain'], cookie['name'], cookie['value'],
expire_time=unified_timestamp(cookie.get('expires')), path=cookie['path'], secure=cookie['secure'])
fmt_common = {
'live_latency': 'high',
'origin': hostname,
'protocol': 'niconico_live',
'video_id': video_id,
'ws': ws,
}
q_iter = (q for q in qualities[1:] if not q.startswith('audio_')) # ignore initial 'abr'
a_map = {96: 'audio_low', 192: 'audio_high'}
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
for fmt in formats:
fmt['protocol'] = 'niconico_live'
if fmt.get('acodec') == 'none':
fmt['format_id'] = next(q_iter, fmt['format_id'])
elif fmt.get('vcodec') == 'none':
abr = parse_bitrate(fmt['url'].lower())
fmt.update({
'abr': abr,
'acodec': 'mp4a.40.2',
'format_id': a_map.get(abr, fmt['format_id']),
})
fmt.update(fmt_common)
return {
'id': video_id,
'title': title,
'downloader_options': {
'max_quality': traverse_obj(embedded_data, ('program', 'stream', 'maxQuality', {str})) or 'normal',
'ws': ws,
'ws_url': ws_url,
},
**traverse_obj(embedded_data, {
'view_count': ('program', 'statistics', 'watchCount'),
'comment_count': ('program', 'statistics', 'commentCount'),

View File

@ -1,41 +0,0 @@
from .floatplane import FloatplaneBaseIE
class SaucePlusIE(FloatplaneBaseIE):
IE_DESC = 'Sauce+'
_VALID_URL = r'https?://(?:(?:www|beta)\.)?sauceplus\.com/post/(?P<id>\w+)'
_BASE_URL = 'https://www.sauceplus.com'
_HEADERS = {
'Origin': _BASE_URL,
'Referer': f'{_BASE_URL}/',
}
_IMPERSONATE_TARGET = True
_TESTS = [{
'url': 'https://www.sauceplus.com/post/YbBwIa2A5g',
'info_dict': {
'id': 'eit4Ugu5TL',
'ext': 'mp4',
'display_id': 'YbBwIa2A5g',
'title': 'Scare the Coyote - Episode 3',
'description': '',
'thumbnail': r're:^https?://.*\.jpe?g$',
'duration': 2975,
'comment_count': int,
'like_count': int,
'dislike_count': int,
'release_date': '20250627',
'release_timestamp': 1750993500,
'uploader': 'Scare The Coyote',
'uploader_id': '683e0a3269688656a5a49a44',
'uploader_url': 'https://www.sauceplus.com/channel/ScareTheCoyote/home',
'channel': 'Scare The Coyote',
'channel_id': '683e0a326968866ceba49a45',
'channel_url': 'https://www.sauceplus.com/channel/ScareTheCoyote/home/main',
'availability': 'subscriber_only',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_initialize(self):
if not self._get_cookies(self._BASE_URL).get('__Host-sp-sess'):
self.raise_login_required()

View File

@ -213,7 +213,7 @@ class CieloTVItIE(SkyItIE): # XXX: Do not subclass from concrete IE
class TV8ItIE(SkyItVideoIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'tv8.it'
_VALID_URL = r'https?://(?:www\.)?tv8\.it/(?:show)?video/(?:[0-9a-z-]+-)?(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?tv8\.it/(?:show)?video/[0-9a-z-]+-(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.tv8.it/video/ogni-mattina-ucciso-asino-di-andrea-lo-cicero-630529',
'md5': '9ab906a3f75ea342ed928442f9dabd21',
@ -227,19 +227,6 @@ class TV8ItIE(SkyItVideoIE): # XXX: Do not subclass from concrete IE
'thumbnail': 'https://videoplatform.sky.it/still/2020/11/18/1605717753954_ogni-mattina-ucciso-asino-di-andrea-lo-cicero_videostill_1.jpg',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.tv8.it/video/964361',
'md5': '1e58e807154658a16edc29e45be38107',
'info_dict': {
'id': '964361',
'ext': 'mp4',
'title': 'GialappaShow - S.4 Ep.2',
'description': 'md5:60bb4ff5af18bbeeaedabc1de5f9e1e2',
'duration': 8030,
'thumbnail': 'https://videoplatform.sky.it/captures/494/2024/11/06/964361/964361_1730888412914_thumb_494.jpg',
'timestamp': 1730821499,
'upload_date': '20241105',
},
}]
_DOMAIN = 'mtv8'

View File

@ -25,7 +25,6 @@ class SportDeutschlandIE(InfoExtractor):
'upload_date': '20230114',
'timestamp': 1673733618,
},
'skip': 'not found',
}, {
'url': 'https://sportdeutschland.tv/deutscherbadmintonverband/bwf-tour-1-runde-feld-1-yonex-gainward-german-open-2022-0',
'info_dict': {
@ -42,7 +41,6 @@ class SportDeutschlandIE(InfoExtractor):
'upload_date': '20220309',
'timestamp': 1646860727.0,
},
'skip': 'not found',
}, {
'url': 'https://sportdeutschland.tv/ggcbremen/formationswochenende-latein-2023',
'info_dict': {
@ -70,7 +68,6 @@ class SportDeutschlandIE(InfoExtractor):
'live_status': 'was_live',
},
}],
'skip': 'not found',
}, {
'url': 'https://sportdeutschland.tv/dtb/gymnastik-international-tag-1',
'info_dict': {
@ -85,30 +82,13 @@ class SportDeutschlandIE(InfoExtractor):
'live_status': 'is_live',
},
'skip': 'live',
}, {
'url': 'https://sportdeutschland.tv/rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'md5': '35c11a19395c938cdd076b93bda54cde',
'info_dict': {
'id': '9f27a97d-1544-4d0b-aa03-48d92d17a03a',
'ext': 'mp4',
'title': 'GFL2: Rostock Griffins vs. Elmshorn Fighting Pirates',
'display_id': 'rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'channel': 'Rostock Griffins',
'channel_url': 'https://sportdeutschland.tv/rostock-griffins',
'live_status': 'was_live',
'description': 'md5:60cb00067e55dafa27b0933a43d72862',
'channel_id': '9635f21c-3f67-4584-9ce4-796e9a47276b',
'timestamp': 1749913117,
'upload_date': '20250614',
},
}]
def _process_video(self, asset_id, video):
is_live = video['type'] == 'mux_live'
token = self._download_json(
f'https://api.sportdeutschland.tv/api/web/personal/asset-token/{asset_id}',
video['id'], query={'type': video['type'], 'playback_id': video['src']},
headers={'Referer': 'https://sportdeutschland.tv/'})['token']
f'https://api.sportdeutschland.tv/api/frontend/asset-token/{asset_id}',
video['id'], query={'type': video['type'], 'playback_id': video['src']})['token']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://stream.mux.com/{video["src"]}.m3u8?token={token}', video['id'], live=is_live)

View File

@ -41,7 +41,6 @@ class SproutVideoIE(InfoExtractor):
'duration': 703,
'thumbnail': r're:https?://images\.sproutvideo\.com/.+\.jpg',
},
'skip': 'Account Disabled',
}, {
# http formats 'sd' and 'hd' are available
'url': 'https://videos.sproutvideo.com/embed/119cd6bc1a18e6cd98/30751a1761ae5b90',
@ -99,17 +98,10 @@ class SproutVideoIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url)
webpage = self._download_webpage(
url, video_id, headers=traverse_obj(smuggled_data, {'Referer': 'referer'}), impersonate=True)
url, video_id, headers=traverse_obj(smuggled_data, {'Referer': 'referer'}))
data = self._search_json(
r'var\s+(?:dat|playerInfo)\s*=\s*["\']', webpage, 'player info', video_id,
contains_pattern=r'[A-Za-z0-9+/=]+', end_pattern=r'["\'];',
transform_source=lambda x: base64.b64decode(x).decode())
# SproutVideo may send player info for 'SMPTE Color Monitor Test' [a791d7b71b12ecc52e]
# e.g. if the user-agent we used with the webpage request is too old
video_uid = data['videoUid']
if video_id != video_uid:
raise ExtractorError(f'{self.IE_NAME} sent the wrong video data ({video_uid})')
r'var\s+dat\s*=\s*["\']', webpage, 'data', video_id, contains_pattern=r'[A-Za-z0-9+/=]+',
end_pattern=r'["\'];', transform_source=lambda x: base64.b64decode(x).decode())
formats, subtitles = [], {}
headers = {

View File

@ -63,7 +63,6 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
'PO_TOKEN_REQUIRED_CONTEXTS': [_PoTokenContext.GVS],
'SUPPORTS_COOKIES': True,
'PLAYER_PARAMS': '8AEB',
},
'web_embedded': {
'INNERTUBE_CONTEXT': {
@ -175,7 +174,6 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 7,
'SUPPORTS_COOKIES': True,
'PLAYER_PARAMS': '8AEB',
},
'tv_simply': {
'INNERTUBE_CONTEXT': {

View File

@ -3552,11 +3552,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f['format_note'] = join_nonempty(f.get('format_note'), 'MISSING POT', delim=' ')
f['source_preference'] -= 20
# XXX: Check if IOS HLS formats are affected by player PO token enforcement; temporary
# See https://github.com/yt-dlp/yt-dlp/issues/13511
if proto == 'hls' and client_name == 'ios':
f['__needs_testing'] = True
itags[itag].add(key)
if itag and all_formats:
@ -4285,7 +4280,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if upload_date and live_status not in ('is_live', 'post_live', 'is_upcoming'):
# Newly uploaded videos' HLS formats are potentially problematic and need to be checked
# XXX: This is redundant for as long as we are already checking all IOS HLS formats
upload_datetime = datetime_from_str(upload_date).replace(tzinfo=dt.timezone.utc)
if upload_datetime >= datetime_from_str('today-2days'):
for fmt in info['formats']:

View File

@ -857,7 +857,7 @@ class JSInterpreter:
obj = {}
obj_m = re.search(
r'''(?x)
(?<![a-zA-Z$0-9.])%s\s*=\s*{\s*
(?<!\.)%s\s*=\s*{\s*
(?P<fields>(%s\s*:\s*function\s*\(.*?\)\s*{.*?}(?:,\s*)?)*)
}\s*;
''' % (re.escape(objname), _FUNC_NAME_RE),

View File

@ -1 +0,0 @@
# Utility functions for handling web input based on commonly used JavaScript libraries

View File

@ -1,167 +0,0 @@
from __future__ import annotations
import array
import base64
import datetime as dt
import math
import re
from .._utils import parse_iso8601
TYPE_CHECKING = False
if TYPE_CHECKING:
import collections.abc
import typing
T = typing.TypeVar('T')
_ARRAY_TYPE_LOOKUP = {
'Int8Array': 'b',
'Uint8Array': 'B',
'Uint8ClampedArray': 'B',
'Int16Array': 'h',
'Uint16Array': 'H',
'Int32Array': 'i',
'Uint32Array': 'I',
'Float32Array': 'f',
'Float64Array': 'd',
'BigInt64Array': 'l',
'BigUint64Array': 'L',
'ArrayBuffer': 'B',
}
def parse_iter(parsed: typing.Any, /, *, revivers: dict[str, collections.abc.Callable[[list], typing.Any]] | None = None):
# based on https://github.com/Rich-Harris/devalue/blob/f3fd2aa93d79f21746555671f955a897335edb1b/src/parse.js
resolved = {
-1: None,
-2: None,
-3: math.nan,
-4: math.inf,
-5: -math.inf,
-6: -0.0,
}
if isinstance(parsed, int) and not isinstance(parsed, bool):
if parsed not in resolved or parsed == -2:
raise ValueError('invalid integer input')
return resolved[parsed]
elif not isinstance(parsed, list):
raise ValueError('expected int or list as input')
elif not parsed:
raise ValueError('expected a non-empty list as input')
if revivers is None:
revivers = {}
return_value = [None]
stack: list[tuple] = [(return_value, 0, 0)]
while stack:
target, index, source = stack.pop()
if isinstance(source, tuple):
name, source, reviver = source
try:
resolved[source] = target[index] = reviver(target[index])
except Exception as error:
yield TypeError(f'failed to parse {source} as {name!r}: {error}')
resolved[source] = target[index] = None
continue
if source in resolved:
target[index] = resolved[source]
continue
# guard against Python negative indexing
if source < 0:
yield IndexError(f'invalid index: {source!r}')
continue
try:
value = parsed[source]
except IndexError as error:
yield error
continue
if isinstance(value, list):
if value and isinstance(value[0], str):
# TODO: implement zips `strict=True`
if reviver := revivers.get(value[0]):
if value[1] == source:
# XXX: avoid infinite loop
yield IndexError(f'{value[0]!r} cannot point to itself (index: {source})')
continue
# inverse order: resolve index, revive value
stack.append((target, index, (value[0], value[1], reviver)))
stack.append((target, index, value[1]))
continue
elif value[0] == 'Date':
try:
result = dt.datetime.fromtimestamp(parse_iso8601(value[1]), tz=dt.timezone.utc)
except Exception:
yield ValueError(f'invalid date: {value[1]!r}')
result = None
elif value[0] == 'Set':
result = [None] * (len(value) - 1)
for offset, new_source in enumerate(value[1:]):
stack.append((result, offset, new_source))
elif value[0] == 'Map':
result = []
for key, new_source in zip(*(iter(value[1:]),) * 2):
pair = [None, None]
stack.append((pair, 0, key))
stack.append((pair, 1, new_source))
result.append(pair)
elif value[0] == 'RegExp':
# XXX: use jsinterp to translate regex flags
# currently ignores `value[2]`
result = re.compile(value[1])
elif value[0] == 'Object':
result = value[1]
elif value[0] == 'BigInt':
result = int(value[1])
elif value[0] == 'null':
result = {}
for key, new_source in zip(*(iter(value[1:]),) * 2):
stack.append((result, key, new_source))
elif value[0] in _ARRAY_TYPE_LOOKUP:
typecode = _ARRAY_TYPE_LOOKUP[value[0]]
data = base64.b64decode(value[1])
result = array.array(typecode, data).tolist()
else:
yield TypeError(f'invalid type at {source}: {value[0]!r}')
result = None
else:
result = len(value) * [None]
for offset, new_source in enumerate(value):
stack.append((result, offset, new_source))
elif isinstance(value, dict):
result = {}
for key, new_source in value.items():
stack.append((result, key, new_source))
else:
result = value
target[index] = resolved[source] = result
return return_value[0]
def parse(parsed: typing.Any, /, *, revivers: dict[str, collections.abc.Callable[[typing.Any], typing.Any]] | None = None):
generator = parse_iter(parsed, revivers=revivers)
while True:
try:
raise generator.send(None)
except StopIteration as error:
return error.value

View File

@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2025.06.30'
__version__ = '2025.06.09'
RELEASE_GIT_HEAD = 'b0187844988e557c7e1e6bb1aabd4c1176768d86'
RELEASE_GIT_HEAD = '339614a173c74b42d63e858c446a9cae262a13af'
VARIANT = None
@ -12,4 +12,4 @@ CHANNEL = 'stable'
ORIGIN = 'yt-dlp/yt-dlp'
_pkg_version = '2025.06.30'
_pkg_version = '2025.06.09'