Models¶
Resyndicators¶
-
class
resyndicator.resyndicators.
Resyndicator
(title, query, session=<class 'resyndicator.models.DefaultSession'>, past=None, length=30, **kwargs)[source]¶ The Resyndicator class represents a feed that is generated from the retrieved data on the basis of an SQLAlchemy query. It is identified by the title that you specify on instantiation, so do not change it: doing so is tantamount to creating a new resyndicator.
-
class
Entry
(**kwargs)¶ Default SQLAlchemy entry representation.
Fetchers¶
Base¶
-
class
resyndicator.fetchers.base.
BaseEntryInterface
(fetcher, raw_entry)[source]¶ Base class for entries.
Subclass this to provide a unified interface to any type of entry you want to import.
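The subclassing pattern described above can be sketched as follows. This is a minimal illustration, not the library's code: the `BaseEntryInterface` below is a local stand-in so that the example runs on its own, and `DictEntryInterface` together with the raw entry keys (`url`, `headline`, `teaser`) is hypothetical.

```python
class BaseEntryInterface:
    """Local stand-in for resyndicator.fetchers.base.BaseEntryInterface.

    Only the constructor signature (fetcher, raw_entry) is taken from the
    documentation; everything else here is an assumption for illustration.
    """

    def __init__(self, fetcher, raw_entry):
        self.fetcher = fetcher
        self.raw_entry = raw_entry


class DictEntryInterface(BaseEntryInterface):
    """Hypothetical mapping for raw entries that arrive as plain dicts."""

    @property
    def id(self):
        # Assumption: the URL is unique enough to serve as the entry ID.
        return self.raw_entry['url']

    @property
    def link(self):
        return self.raw_entry['url']

    @property
    def title(self):
        return self.raw_entry.get('headline', '').strip()

    @property
    def summary(self):
        # Returns None when the supplier provides no teaser.
        return self.raw_entry.get('teaser')

    @property
    def summary_type(self):
        # The documented values are "text" or "html".
        return 'text'


item = DictEntryInterface(
    fetcher=None,
    raw_entry={'url': 'https://example.com/a', 'headline': ' Hello '},
)
```

Each property translates one field of the supplier's format into the unified attribute names listed below, so the rest of the pipeline never needs to know the raw format.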
-
class
Entry
(**kwargs)¶ Default SQLAlchemy entry representation.
-
BaseEntryInterface.
author
¶ Entry author
-
BaseEntryInterface.
content
¶ Full content of the entry if given
-
BaseEntryInterface.
content_type
¶ Either text or html (important for the Atom output)
-
BaseEntryInterface.
entry
¶ Return the SQLAlchemy entry.
Here, the source property is used to optionally include the entry source code (specified through settings.INCLUDE_SOURCE). If the entry exists, it is returned unchanged. Otherwise, it is initialized with all the new values from the supplier.
-
BaseEntryInterface.
fetched
¶ Fetch time (set to datetime.datetime.utcnow() by default)
-
BaseEntryInterface.
id
¶ A globally unique ID for internal deduplication and identification by feed readers
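One common way to obtain an ID that is both globally unique and stable across fetches is to derive it from the entry URL. The sketch below uses a name-based UUID for that purpose; this is an assumption for illustration, the documentation does not say how the library builds its IDs, and `entry_id_from_url` is a hypothetical helper.

```python
import uuid


def entry_id_from_url(url):
    """Derive a stable URN from a URL (illustrative, not the library's scheme)."""
    # uuid5 is deterministic: the same URL always yields the same UUID,
    # which is what makes the ID usable for deduplication.
    return uuid.uuid5(uuid.NAMESPACE_URL, url).urn


first = entry_id_from_url('https://example.com/a')
again = entry_id_from_url('https://example.com/a')
other = entry_id_from_url('https://example.com/b')
```

Because the ID depends only on the URL, refetching the same entry produces the same ID, so a feed reader does not show it as new.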
-
BaseEntryInterface.
link
¶ Entry link
-
BaseEntryInterface.
published
¶ Time the entry was published
-
BaseEntryInterface.
source
¶ Optional field to insert any entry source code into the content field. This can be set through settings.INCLUDE_SOURCE.
-
BaseEntryInterface.
summary
¶ Summary or description of the entry
-
BaseEntryInterface.
summary_type
¶ Either text or html (important for the Atom output)
-
BaseEntryInterface.
title
¶ Entry title
-
BaseEntryInterface.
updated
¶ Time the entry was updated on the supplier side
-
class
resyndicator.fetchers.base.
BaseFetcher
(url, interval, session=<class 'resyndicator.models.DefaultSession'>, default_tz=<class 'dateutil.tz.tz.tzutc'>, defaults=None, **kwargs)[source]¶ Base class for fetchers.
Subclass this to implement your own custom types of fetchers.
-
EntryInterface
¶ alias of
BaseEntryInterface
-
author
¶ The author of the data source
-
entries
¶ Yield the SQLAlchemy entries after setting any default values.
-
generator
¶ The generator of the data source
-
hub
¶ Optionally the endpoint of a hub such as PubSubHubbub
-
id
¶ A unique ID for the data source, e.g., the feed
-
link
¶ Link to the data source
-
needs_update
¶ Return whether the source is ripe for an update.
-
next_check
¶ The time of the next update.
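The relationship between the check interval, next_check and needs_update could look roughly like this. It is a sketch under assumptions: the interval is taken to be in seconds, a source that has never been checked is treated as due, and the free functions with explicit `last_check` and `now` arguments are hypothetical (the documented members are attributes of the fetcher).

```python
from datetime import datetime, timedelta, timezone


def next_check(last_check, interval):
    """Return the time of the next update, given the last check time."""
    # Assumption: interval is a number of seconds.
    return last_check + timedelta(seconds=interval)


def needs_update(last_check, interval, now):
    """Return whether the source is due for an update at time `now`."""
    if last_check is None:
        # Assumption: a source that was never checked is always due.
        return True
    return now >= next_check(last_check, interval)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
due_at = next_check(start, 600)
```

Passing `now` explicitly keeps the logic easy to test; a real fetcher would presumably read the current time itself.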
-
parse
(response)[source]¶ Implement this function to convert your data source to something the update method can work with.
-
persist
()[source]¶ Commit the entries or any updates to them to the database and return the entries that have been created.
-
retrieve
()[source]¶ Retrieve the data source.
This is by default a wrapper around the requests library that sets headers such that servers can indicate that the content hasn’t changed since the last retrieval, and by default also specifies a custom user agent and timeout.
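The headers that let a server report unchanged content are the standard HTTP conditional request headers. The sketch below only builds such a header dict; the function name, its parameters and the user agent string are hypothetical, and how the library stores the validators from the previous response is not documented here.

```python
def conditional_headers(etag=None, last_modified=None,
                        user_agent='example-fetcher/0.1'):
    """Build request headers for a conditional GET (illustrative helper)."""
    headers = {'User-Agent': user_agent}
    if etag:
        # The server answers 304 Not Modified if the ETag still matches.
        headers['If-None-Match'] = etag
    if last_modified:
        # Same idea, based on the Last-Modified value of the last response.
        headers['If-Modified-Since'] = last_modified
    return headers


headers = conditional_headers(
    etag='"abc123"',
    last_modified='Wed, 21 Oct 2015 07:28:00 GMT',
)
# With the requests library this dict would be passed along as, e.g.:
#     requests.get(url, headers=headers, timeout=10)
# A response with status code 304 then means nothing has changed.
```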
-
subtitle
¶ Subtitle of the data source
-
title
¶ Title of the data source
-
Feed¶
Sitemap¶
-
class
resyndicator.fetchers.sitemap.
SitemapEntryInterface
(fetcher, raw_entry)[source]¶ Entry mapping for the sitemap entry.
-
id
¶ Entry ID generated from URL.
-
link
¶ Sitemap entry location (i.e. the URL).
-
source
¶ Entry raw source as JSON inside an HTML snippet.
-
updated
¶ Lastmod time of entry with fallback on publish times of video and news extensions.
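The fallback order described above can be sketched as a simple first-match lookup. The raw entry is assumed to be a dict, and the key names, the `pick_updated` helper and the use of ISO 8601 strings are assumptions for illustration, not the library's actual data model.

```python
from datetime import datetime, timezone

# Assumed key names, tried in order of preference.
FALLBACK_KEYS = ('lastmod', 'news_publication_date', 'video_publication_date')


def pick_updated(raw_entry, default_tz=timezone.utc):
    """Return the first available timestamp as an aware datetime, or None."""
    for key in FALLBACK_KEYS:
        value = raw_entry.get(key)
        if value:
            parsed = datetime.fromisoformat(value)
            if parsed.tzinfo is None:
                # Naive timestamps get the default time zone, similar in
                # spirit to the fetcher's default_tz parameter.
                parsed = parsed.replace(tzinfo=default_tz)
            return parsed
    return None


with_lastmod = pick_updated({'lastmod': '2020-05-01T12:00:00+02:00'})
news_only = pick_updated({'news_publication_date': '2020-05-01'})
```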
-
-
class
resyndicator.fetchers.sitemap.
SitemapFetcher
(url, interval, session=<class 'resyndicator.models.DefaultSession'>, default_tz=<class 'dateutil.tz.tz.tzutc'>, defaults=None, **kwargs)[source]¶ Fetcher that supports sitemaps and recognizes some features of some sitemap extensions.
-
EntryInterface
¶ alias of
SitemapEntryInterface
-
id
¶ Sitemap ID generated from explicitly set URL.
-
-
class
resyndicator.fetchers.sitemap.
SitemapIndexFetcher
(*args, **kwargs)[source]¶ Entry point that distributes the sitemap URLs listed in a sitemap index across individual sitemap fetchers. It is still a bit of a hack: it supports only one level of sitemap indices and circumvents the request scheduling, so it can block the scheduler for a while and send many consecutive requests to the same host.
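Extracting the sitemap URLs from a single-level sitemap index can be sketched with the standard library alone. This is not the library's parser; `sitemap_urls` and the sample document are hypothetical, and only the XML namespace comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {'sm': 'http://www.sitemaps.org/schemas/sitemap/0.9'}


def sitemap_urls(index_xml):
    """Return the <loc> URLs of all sitemaps listed in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall('sm:sitemap/sm:loc', SITEMAP_NS)
        if loc.text
    ]


INDEX = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    '<sitemap><loc>https://example.com/sitemap-1.xml</loc></sitemap>'
    '<sitemap><loc> https://example.com/sitemap-2.xml </loc></sitemap>'
    '</sitemapindex>'
)
urls = sitemap_urls(INDEX)
```

Each returned URL would then be handed to its own sitemap fetcher. Nested indices are not followed, which mirrors the one-level limitation noted above.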
-
EntryInterface
¶ alias of
SitemapEntryInterface
-
class
SitemapFetcher
(url, interval, session=<class 'resyndicator.models.DefaultSession'>, default_tz=<class 'dateutil.tz.tz.tzutc'>, defaults=None, **kwargs)¶ Fetcher that supports sitemaps and recognizes some features of some sitemap extensions.
-
EntryInterface
¶ alias of
SitemapEntryInterface
-
id
¶ Sitemap ID generated from explicitly set URL.
-
static
parse
(response)¶ Return parsed sitemap.
-
update
()¶ Process sitemap.
-
-
SitemapIndexFetcher.
id
¶ Unique ID of the sitemap generated from explicitly set URL.
-
SitemapIndexFetcher.
raw_entries
¶ The raw entries as returned by parser.
-
Twitter¶
-
class
resyndicator.fetchers.twitter.
TweetInterface
(fetcher, raw_entry)[source]¶ Mapping for individual tweets.
-
class
Entry
(**kwargs)¶ Default SQLAlchemy entry representation.
-
TweetInterface.
author
¶ The tweep.
-
TweetInterface.
content
¶ The HTML representation of the tweet.
-
TweetInterface.
entry
¶ The SQLAlchemy entry representing the tweet.
-
TweetInterface.
fetched
¶ Time the tweet was fetched.
-
TweetInterface.
id
¶ The tweet ID as string.
-
TweetInterface.
link
¶ The URL of the tweet.
-
TweetInterface.
source_id
¶ URN to identify the tweep.
-
TweetInterface.
source_link
¶ Link to the Twitter account of the author.
-
TweetInterface.
source_title
¶ Some generated source title based on the author name.
-
TweetInterface.
title
¶ Optionally shortened representation of the tweet text.
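An optionally shortened title could be produced along these lines. The helper name, the `max_length` parameter and the `...` placeholder are assumptions; the documentation does not state how or to what length the library shortens the text.

```python
import textwrap


def tweet_title(text, max_length=None):
    """Collapse whitespace and optionally shorten the text at a word boundary."""
    collapsed = ' '.join(text.split())
    if max_length is None:
        # Shortening is optional: without a limit the full text is the title.
        return collapsed
    # textwrap.shorten drops whole words from the end and appends the
    # placeholder so that the result fits within max_length characters.
    return textwrap.shorten(collapsed, width=max_length, placeholder='...')


long_text = 'A fairly long tweet that would not fit into a compact feed title'
short_title = tweet_title(long_text, max_length=20)
```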
-
TweetInterface.
tweet_html
¶ Assemble a presentable HTML representation of the tweet.
-
TweetInterface.
tweet_text
¶ Assemble a presentable text representation of the tweet.
-
TweetInterface.
updated
¶ Time the tweet was created.
-
class
resyndicator.fetchers.twitter.
TwitterStreamer
(oauth_token, oauth_secret, session=<class 'resyndicator.models.DefaultSession'>, timeout=0, **kwargs)[source]¶ A Twitter streaming client that doesn’t work at the moment due to an error in the Birdy library. Please use the TwitterFetcher in the meantime.
Content¶
-
class
resyndicator.fetchers.content.
ContentFetcher
(session=<class 'resyndicator.models.DefaultSession'>, past=None, **kwargs)[source]¶ Fetcher class for retrieval and extraction of content from websites.
This fetcher incrementally downloads and extracts (using Readability) the content from any pages associated with entries that don’t already have long-form content associated with them.
-
class
Entry
(**kwargs)¶ Default SQLAlchemy entry representation.
-
ContentFetcher.
fetch
()[source]¶ Run one full fetching cycle. This is the main entry point for the content fetching process.
-
static
ContentFetcher.
get_hostname
(entry)[source]¶ Wrapper for extracting the hostname from entry links. (Another ContentFetcher might need to remove the leading www.)
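A hostname wrapper of this kind, including the variant that strips a leading www., might look like the sketch below. It takes the link string directly rather than an entry object, and the `strip_www` flag is a hypothetical addition to show what an overriding fetcher could do.

```python
from urllib.parse import urlparse


def get_hostname(link, strip_www=False):
    """Return the lower-cased hostname of a URL (illustrative helper)."""
    # urlparse(...).hostname is None for links without a network location.
    hostname = urlparse(link).hostname or ''
    if strip_www and hostname.startswith('www.'):
        hostname = hostname[len('www.'):]
    return hostname


plain = get_hostname('https://www.example.com/articles/1')
stripped = get_hostname('https://www.example.com/articles/1', strip_www=True)
```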
-
class