Mirror of https://github.com/Radiquum/furaffinity-dl.git (synced 2025-04-05 07:44:37 +00:00)
First public release of the python rewrite
Very late due to me being very busy + procrastinating
This commit is contained in:
parent 4f3c9eb6bb
commit 972dacb5bd
5 changed files with 198 additions and 276 deletions
.gitignore (vendored): 27 changed lines
@@ -1,22 +1,7 @@
# Pictures
*.jpg
*.png
*.bmp
*.gif
# Meta
*.meta
# Audio
*.mp3
*.wav
*.wmv
# Flash
*.swf
# Documents
*.txt
*.pdf
*.doc
*.docx
# Swap files
*.swp
# Cookies
# My cookies, committing them to a public repo would not be a good idea
cookies.txt

# Downloaded types
*.png
*.jpg
*.json
LICENSE: 29 changed lines
@@ -1,28 +1,7 @@
Copyright (c) 2015, Sergey "Shnatsel" Davidoff
All rights reserved.
Copyright 2020 Xerbo (xerbo@protonmail.com)

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of furaffinity-dl nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
README.md: 29 changed lines
@@ -1,38 +1,39 @@
This branch is the development version of furaffinity-dl rewritten in python.

# FurAffinity Downloader
**furaffinity-dl** is a bash script for batch downloading of galleries and favorites from furaffinity.net users.
**furaffinity-dl** is a python script for batch downloading of galleries (and scraps/favourites) from furaffinity.net users.
It was written for preservation of culture, to counter the people nuking their galleries every once a while.

Supports all known submission types: images, texts and audio.

## Requirements
Coreutils, bash and wget are the only dependencies. However if you want to embed metadata into files you will need eyed3 and exiftool
The exacts are unknown due to the fact that this is still early in development, you should only need beautifulsoup4 to be installed though. I will put a `requirements.txt` in the repo soon

furaffinity-dl was tested only on Linux. It should also work on Mac and BSDs.
Windows users can get it to work via Microsoft's [WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10). Cygwin is not supported.
furaffinity-dl was tested only on Linux. It should also work on Mac, Windows and any other platform that supports python.

## Usage
Make it executable with
`chmod +x faraffinity-dl`
And then run it with
`./furaffinity-dl section/username`
Run it with
`./furaffinity-dl.py category username`
or:
`python3 furaffinity-dl.py category username`

All files from the given section and user will be downloaded to the current directory.

### Examples
`./furaffinity-dl gallery/mylafox`
`python3 fadl.py gallery koul`

`./furaffinity-dl -o mylasArt gallery/mylafox`
`python3 fadl.py -o koulsArt gallery koul`

`./furaffinity-dl -o koulsFavs favorites/koul`
`python3 fadl.py -o mylasFavs favorites mylafox`

For a full list of command line arguemnts use `./furaffinity-dl -h`.
For a full list of command line arguments use `./furaffinity-dl -h`.

You can also log in to download restricted content. To do that, log in to FurAffinity in your web browser, export cookies to a file from your web browser in Netscape format (there are extensions to do that [for Firefox](https://addons.mozilla.org/en-US/firefox/addon/ganbo/) and [for Chrome/Vivaldi](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg)) and pass them to the script as a second parameter, like this:

`./furaffinity-dl -c /path/to/your/cookies.txt gallery/gonnaneedabiggerboat`
`python3 fadl.py -c cookies.txt gallery letodoesartt`

## TODO
* Download user bio, post tags and ideally user comments
* Download user information.

## Disclaimer
It is your own responsibility to check whether batch downloading is allowed by FurAffinity's terms of service and to abide by them. For further disclaimers see LICENSE.
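The `requirements.txt` promised in the new Requirements section is not part of this commit. As a hedged sketch only, based on the third-party imports in the new furaffinity-dl.py (bs4 and requests; everything else it imports is standard library), it would likely amount to:

```
beautifulsoup4
requests
```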
furaffinity-dl: 216 changed lines (deleted)
@@ -1,216 +0,0 @@
#!/bin/bash
# shellcheck disable=SC2001
set -e

# Default options
outdir="."
prefix="https:"
metadata=true
rename=true
maxsavefiles="0"
overwrite=false
textmeta=false
classic=false

# Helper functions
help() {
    echo "Usage: $0 [ARGUMENTS] SECTION/USER
Downloads the entire gallery/scraps/favorites of any furaffinity user.

Arguments:
-h (H)elp screen
-i Use an (I)nsecure connection when downloading
-o The (O)utput directory to put files in
-c If you need to download restricted content
you can provide a path to a (C)ookie file
-p (P)lain file without any additional metadata
-r Don't (R)ename files, just give them the same
filename as on facdn
-n (N)unmber of images to download, starting from
the most recent submission
-w Over(Write) files if they already exist
-s (S)eperate metadata files, to make sure all
metadata is downloaded regardless of file
-t Not using the \"beta\" (T)heme

Examples:
$0 gallery/mylafox
$0 -o mylasArt gallery/mylafox
$0 -o koulsFavs favorites/koul

You can also log in to FurAffinity to download restricted content, like this:
$0 -c /path/to/your/cookies.txt gallery/gonnaneedabiggerboat

DISCLAIMER: It is your own responsibility to check whether batch downloading is allowed by FurAffinity terms of service and to abide by them."
    exit 1
}

# Display help if no arguments given
[[ $# -eq 0 ]] && help

# Options via arguments
while getopts 'o:c:n:iphrwst' flag; do
    case "${flag}" in
        t) classic=true;;
        w) overwrite=true;;
        o) outdir=${OPTARG};;
        c) cookiefile=${OPTARG};;
        i) prefix="http:";;
        p) metadata=false;;
        r) rename=false;;
        n) maxsavefiles=${OPTARG};;
        h) help;;
        s) textmeta=true;;
        *) help;;
    esac
done

# Detect installed metadata injectors
eyed3=true
if [ -z "$(command -v eyeD3)" ]; then
    eyed3=false
    echo "INFO: eyed3 is not installed, no metadata will be injected into music files."
fi

exiftool=true
if [ -z "$(command -v exiftool)" ]; then
    exiftool=false
    echo "INFO: exiftool is not installed, no metadata will be injected into pictures."
fi

cleanup() {
    rm -f "$tempfile"
}

# Attempt to create the output directory
mkdir -p -- "$outdir"

# Setup temporarily file with 600 perms
tempfile="$(umask u=rwx,g=,o= && mktemp --suffix=_fa-dl)"

# Call cleanup function on exit
trap cleanup EXIT

if [ -z "$cookiefile" ]; then
    # Set wget with a custom user agent
    fwget() {
        wget --quiet --user-agent="Mozilla/5.0 furaffinity-dl (https://github.com/Xerbo/furaffinity-dl)" "$@"
    }
else
    # Set wget with a custom user agent and cookies
    fwget() {
        wget --quiet --user-agent="Mozilla/5.0 furaffinity-dl (https://github.com/Xerbo/furaffinity-dl)" --load-cookies "$cookiefile" "$@"
    }
fi

url="https://www.furaffinity.net/${*: -1}"
download_count="0"

# Iterate over the gallery pages with thumbnails and links to artwork view pages
while true; do
    fwget "$url" -O "$tempfile"
    if [ -n "$cookiefile" ] && grep -q 'furaffinity.net/login/' "$tempfile"; then
        echo "ERROR: You have provided a cookies file, but it does not contain valid cookies.

If this file used to work, this means that the cookies have expired;
you will have to log in to FurAffinity from your web browser and export the cookies again.

If this is the first time you're trying to use cookies, make sure you have exported them
in Netscape format (this is normally done through \"cookie export\" browser extensions)
and supplied the correct path to the cookies.txt file to this script.

If that doesn't resolve the issue, please report the problem at
https://github.com/Xerbo/furaffinity-dl/issues" >&2
        exit 1
    fi

    # Get URL for next page out of "Next" button. Required for favorites, pages of which are not numbered
    if [ $classic = true ]; then
        next_page_url="$(grep '<a class="button-link right" href="' "$tempfile" | grep '">Next ❯❯</a>' | cut -d '"' -f 4 | sort -u)"
    else
        next_page_url="$(grep -B 1 --max-count=1 'type="submit">Next' "$tempfile" | grep form | cut -d '"' -f 2)"
    fi

    # Extract links to pages with individual artworks and iterate over them
    artwork_pages="$(grep '<a href="/view/' "$tempfile" | grep -E --only-matching '/view/[[:digit:]]+/' | uniq)"
    for page in $artwork_pages; do
        # Download the submission page
        fwget -O "$tempfile" "https://www.furaffinity.net$page"

        if grep -q "System Message" "$tempfile"; then
            echo "WARNING: $page seems to be inaccessible, skipping."
            continue
        fi

        # Get the full size image URL.
        # This will be a facdn.net link, we will default to HTTPS
        # but this can be disabled with -i or --http for specific reasons
        image_url="$prefix$(grep --only-matching --max-count=1 ' href="//d.facdn.net/art/.\+">Download' "$tempfile" | cut -d '"' -f 2)"

        # Get metadata
        description="$(grep 'og:description" content="' "$tempfile" | cut -d '"' -f 4)"
        if [ $classic = true ]; then
            title="$(grep -Eo '<h2>.*</h2>' "$tempfile" | awk -F "<h2>" '{print $2}' | awk -F "</h2>" '{print $1}')"
        else
            title="$(grep -Eo '<h2><p>.*</p></h2>' "$tempfile" | awk -F "<p>" '{print $2}' | awk -F "</p>" '{print $1}')"
        fi

        file_type="${image_url##*.}"
        file_name="$(echo "$image_url" | cut -d "/" -f 7)"
        if [[ "$file_name" =~ ^[0-9]{0,12}$ ]]; then
            file_name="$(echo "$image_url" | cut -d "/" -f 8)"
        fi

        # Choose the output path
        if [ $rename = true ]; then
            # FIXME titles that are just a single emoji get changed to " " and overwrite eachother
            file="$outdir/$(echo "$title" | sed -e 's/[^A-Za-z0-9._-]/ /g').$file_type"
        else
            file="$outdir/$file_name"
        fi

        # Download the image
        if [ ! -f "$file" ] || [ $overwrite = true ] ; then
            wget --quiet --show-progress "$image_url" -O "$file"
        else
            echo "File already exists, skipping. Use -w to skip this check"
        fi

        mime_type="$(file -- "$file")"

        if [ $textmeta = true ]; then
            echo -ne "Title: $title\nURL: $page\nFilename: $file_name\nDescription: $description" > "$file.meta"
        fi

        # Add metadata
        if [[ $mime_type == *"audio"* ]]; then
            # Use eyeD3 for injecting metadata into audio files (if it's installed)
            if [ $eyed3 = true ] && [ $metadata = true ]; then
                if [ -z "$description" ]; then
                    eyeD3 -t "$title" -- "$file" || true
                else
                    # HACK: eyeD3 throws an error if a description containing a ":"
                    eyeD3 -t "$title" --add-comment "${description//:/\\:}" -- "$file" || true
                fi
            fi
        elif [[ $mime_type == *"image"* ]]; then
            # Use exiftool for injecting metadata into pictures (if it's installed)
            if [ $exiftool = true ] && [ $metadata = true ]; then
                cat -- "$file" | exiftool -description="$description" -title="$title" -overwrite_original - > "$tempfile" && mv -- "$tempfile" "$file" || true
            fi
        fi

        # If there is a file download limit then keep track of it
        if [ "$maxsavefiles" -ne "0" ]; then
            download_count="$((download_count + 1))"

            if [ "$download_count" -ge "$maxsavefiles" ]; then
                echo "Reached set file download limit."
                exit 0
            fi
        fi
    done

    [ -z "$next_page_url" ] && break
    url='https://www.furaffinity.net'"$next_page_url"
done
furaffinity-dl.py (new executable file): 173 changed lines
@@ -0,0 +1,173 @@
#!/usr/bin/python3
import argparse
from argparse import RawTextHelpFormatter
import json
from bs4 import BeautifulSoup
import requests
import urllib.request
import http.cookiejar as cookielib
import urllib.parse
import re
import os

'''
Please refer to LICENSE for licensing conditions.

current ideas / things to do:
-r replenish, keep downloading until it finds a already downloaded file
-n number of posts to download
file renaming to title
metadata injection (gets messy easily)
sqlite database
support for beta theme
using `requests` instead of `urllib`
turn this into a module
'''

# Argument parsing
parser = argparse.ArgumentParser(formatter_class=RawTextHelpFormatter, description='Downloads the entire gallery/scraps/favorites of a furaffinity user', epilog='''
Examples:
python3 fadl.py gallery koul
python3 fadl.py -o koulsArt gallery koul
python3 fadl.py -o mylasFavs favorites mylafox\n
You can also log in to FurAffinity in a web browser and load cookies to download restricted content:
python3 fadl.py -c cookies.txt gallery letodoesart\n
DISCLAIMER: It is your own responsibility to check whether batch downloading is allowed by FurAffinity terms of service and to abide by them.
''')
parser.add_argument('category', metavar='category', type=str, nargs='?', default='gallery',
                    help='the category to download, gallery/scraps/favorites')
parser.add_argument('username', metavar='username', type=str, nargs='?',
                    help='username of the furaffinity user')
parser.add_argument('-o', metavar='output', dest='output', type=str, default='.', help="output directory")
parser.add_argument('-c', metavar='cookies', dest='cookies', type=str, default='', help="path to a NetScape cookies file")
parser.add_argument('-s', metavar='start', dest='start', type=int, default=1, help="page number to start from")

args = parser.parse_args()
if args.username == None:
    parser.print_help()
    exit()

# Create output directory if it doesn't exist
if args.output != '.':
    os.makedirs(args.output, exist_ok=True)

# Check validity of category
valid_categories = ['gallery', 'favorites', 'scraps']
if not args.category in valid_categories:
    raise Exception('Category is not valid', args.category)

# Check validity of username
if bool(re.compile(r'[^a-zA-Z0-9\-~._]').search(args.username)):
    raise Exception('Username contains non-valid characters', args.username)

# Initialise a session
session = requests.Session()
session.headers.update({'User-Agent': 'furaffinity-dl redevelopment'})

# Load cookies from a netscape cookie file (if provided)
if args.cookies != '':
    cookies = cookielib.MozillaCookieJar(args.cookies)
    cookies.load()
    session.cookies = cookies

base_url = 'https://www.furaffinity.net'
gallery_url = '{}/gallery/{}'.format(base_url, args.username)
page_num = args.start

# The cursed function that handles downloading
def download_file(path):
    page_url = '{}{}'.format(base_url, path)
    response = session.get(page_url)
    s = BeautifulSoup(response.text, 'html.parser')

    image = s.find(class_='download').find('a').attrs.get('href')
    title = s.find(class_='submission-title').find('p').contents[0];
    filename = image.split("/")[-1:][0]
    data = {
        'id': int(path.split('/')[-2:-1][0]),
        'filename': filename,
        'author': s.find(class_='submission-id-sub-container').find('a').find('strong').text,
        'date': s.find(class_='popup_date').attrs.get('title'),
        'title': title,
        'description': s.find(class_='submission-description').text.strip().replace('\r\n', '\n'),
        "tags": [],
        'views': int(s.find(class_='views').find(class_='font-large').text),
        'favorites': int(s.find(class_='favorites').find(class_='font-large').text),
        'rating': s.find(class_='rating-box').text.strip(),
        'comments': []
    }

    # Extact tags
    for tag in s.find(class_='tags-row').findAll(class_='tags'):
        data['tags'].append(tag.find('a').text)

    # Extract comments
    for comment in s.findAll(class_='comment_container'):
        temp_ele = comment.find(class_='comment-parent')
        parent_cid = None if temp_ele == None else int(temp_ele.attrs.get('href')[5:])

        # Comment deleted or hidden
        if comment.find(class_='comment-link') == None:
            continue

        data['comments'].append({
            'cid': int(comment.find(class_='comment-link').attrs.get('href')[5:]),
            'parent_cid': parent_cid,
            'content': comment.find(class_='comment_text').contents[0].strip(),
            'username': comment.find(class_='comment_username').text,
            'date': comment.find(class_='popup_date').attrs.get('title')
        })

    # Write a UTF-8 encoded JSON file for metadata
    with open(os.path.join(args.output, '{}.json'.format(filename)), 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=4)

    print('Downloading "{}"... '.format(title))

    # Because for some god forsaken reason FA keeps the original filename in the upload, in the case that it contains non-ASCII
    # characters it can make this thing blow up. So we have to do some annoying IRI stuff to make it work. Maybe consider `requests`
    # instead of urllib
    def strip_non_ascii(s): return ''.join(i for i in s if ord(i) < 128)
    url = 'https:{}'.format(image)
    url = urllib.parse.urlsplit(url)
    url = list(url)
    url[2] = urllib.parse.quote(url[2])
    url = urllib.parse.urlunsplit(url)
    urllib.request.urlretrieve(url, os.path.join(args.output, strip_non_ascii(filename)))

# Main downloading loop
while True:
    page_url = '{}/{}'.format(gallery_url, page_num)
    response = session.get(page_url)
    s = BeautifulSoup(response.text, 'html.parser')

    # Account status
    if page_num == 1:
        if s.find(class_='loggedin_user_avatar') != None:
            account_username = s.find(class_='loggedin_user_avatar').attrs.get('alt')
            print('Logged in as', account_username)
        else:
            print('Not logged in, some users gallery\'s may be unaccessible and NSFW content is not downloadable')

    # System messages
    if s.find(class_='notice-message') != None:
        message = s.find(class_='notice-message').find('div')
        for ele in message:
            if ele.name != None:
                ele.decompose()

        raise Exception('System Message', message.text.strip())

    # End of gallery
    if s.find(id='no-images') != None:
        print('End of gallery')
        break

    # Download all images on the page
    for img in s.findAll('figure'):
        download_file(img.find('a').attrs.get('href'))

    page_num += 1
    print('Downloading page', page_num)

print('Finished downloading')
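The TODO block at the top of the new script lists "using `requests` instead of `urllib`" for the actual file download. A minimal, hypothetical sketch of that idea, reusing the `session`, `image`, `filename` and output-directory names from download_file() above (not part of this commit); requests should also take care of percent-encoding unsafe characters in the URL path, which the current code handles by hand with urllib.parse.quote:

```python
import os
import requests

def download_with_requests(session, image, filename, output):
    # Hypothetical replacement for the urllib.request.urlretrieve() call in download_file().
    # 'image' is the protocol-relative href taken from the submission's Download button.
    url = 'https:{}'.format(image)
    with session.get(url, stream=True) as response:
        response.raise_for_status()
        # Stream the file to disk in chunks instead of buffering it all in memory.
        with open(os.path.join(output, filename), 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
```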