A couple of weeks ago I changed my workflow for reading and sending emails. To have more control over them, I started using offlineimap, which downloads your emails to a directory on your filesystem. I then used Mu, which indexes this directory so you can search your emails offline with queries like "show me unread emails from all inboxes", or search for a word across all your emails from all inboxes. On top of that I used Mu4e, which is an email client inside Emacs (my default editor). As I'm using Spacemacs, I added a binding that opens Mu4e with SPC M, and boom: I can see all my emails, I can search with s, and I have a bookmark that shows all unread emails with bi.
Now I want the same for RSS.
Here is the problem: I did some research and couldn't find similar tools that do the same for RSS, even though it should be easier; there is no authentication required, unlike IMAP/SMTP servers. So I spent an hour or so writing a small script that does the same as offlineimap. On my machine this script is called offlinerss; it's the first piece of the puzzle, and it looks like this:
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'bundler/inline'
require 'open-uri'
require 'fileutils'
require 'digest'
require 'yaml'

gemfile do
  source 'https://rubygems.org'
  gem 'rss'
end

# Join the path segments and create the directory if it doesn't exist yet.
def mkdir(*paths)
  path = File.join(*paths)
  FileUtils.mkdir(path) unless Dir.exist?(path)
  path
end

destination = mkdir(File.expand_path('~/rss/'))
inbox = mkdir(destination, 'INBOX')
meta_dir = mkdir(destination, '.meta')

config_file = File.join(destination, 'config.yml')
config = YAML.load_file(config_file)
urls = config['urls']

urls.each do |url|
  url_digest = Digest::SHA1.hexdigest(url)
  URI.open(url) do |rss|
    content = rss.read
    feed = RSS::Parser.parse(content)

    # Save every item we haven't seen before to INBOX; Atom entries have
    # an `id`, RSS items have a `guid`.
    feed.items.each do |item|
      id = item.respond_to?(:id) ? item.id : item.guid
      id_digest = Digest::SHA1.hexdigest(id.content)
      file_basename = url_digest + '-' + id_digest + '.xml'
      next unless Dir.glob(File.join(destination, '**', file_basename)).empty?

      filename = File.join(inbox, file_basename)
      File.write(filename, item.to_s)
    end

    # Strip the entries/items out of the feed and keep the rest as metadata.
    [{ start_tag: '<entry>', end_tag: '</entry>' },
     { start_tag: '<item>', end_tag: '</item>' }].each do |tag|
      next unless content.include?(tag[:start_tag])

      content[content.index(tag[:start_tag])...(content.rindex(tag[:end_tag]) + tag[:end_tag].length)] = ''
    end

    metafile = File.join(meta_dir, url_digest + '.xml')
    File.write(metafile, content)
  end
end
I have a small config file in ~/rss/config.yml which has all the URLs I care about; so far just the main Ruby/Rails/Go blogs, to be alerted about the latest versions.
urls:
  - https://server.tld/feed.rss
  - https://server.tld/feed.atom
This just reads the URLs and saves each entry to a file on your machine under ~/rss/INBOX, if the file doesn't already exist in any subdirectory of ~/rss. Then it removes all entries/items from the feed and saves the rest to ~/rss/.meta. The file name of an RSS item is sha1(url)-sha1(item.id).xml and the meta file name is sha1(url).xml. Very simple.
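As a sketch of that naming scheme, both file names can be derived with nothing but Digest::SHA1 from the standard library (the URL and item id below are made up):

```ruby
require 'digest'

url     = 'https://server.tld/feed.rss'     # hypothetical feed URL
item_id = 'https://server.tld/posts/42'     # hypothetical entry id/guid

# Item file: sha1(url)-sha1(item.id).xml — two 40-char hex digests.
item_file = Digest::SHA1.hexdigest(url) + '-' + Digest::SHA1.hexdigest(item_id) + '.xml'

# Meta file: sha1(url).xml
meta_file = Digest::SHA1.hexdigest(url) + '.xml'
```

Because the item name embeds both digests, the `Dir.glob` check in the script can tell whether an entry was already fetched no matter which subdirectory it has since been moved to.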
So now I need to write a client that reads the files in ~/rss/, renders the XML, and offers some actions: creating directories under ~/rss/, and moving a file to another directory after it's read, or when the user wants to move it to read-later or something, just like email directories.
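Such a "move" action could be a few lines of Ruby; a minimal sketch, assuming the same ~/rss layout as above (the move_item name and the read-later directory are my own invention, not part of the script):

```ruby
require 'fileutils'

# Move an item file out of INBOX into another directory under the same
# root, creating the directory if needed ('read-later', 'archive', ...).
# `root` is the ~/rss directory the fetcher writes to.
def move_item(root, basename, to_dir)
  target = File.join(root, to_dir)
  FileUtils.mkdir_p(target)
  FileUtils.mv(File.join(root, 'INBOX', basename), File.join(target, basename))
end
```

Since the fetcher skips any entry whose file name exists anywhere under ~/rss, a moved item won't be re-downloaded into INBOX on the next run.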
Another piece of the puzzle is an indexer, like what Mu does for emails.
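Until a real indexer exists, a naive stand-in is to scan the saved item files and match a query against their raw XML; a rough sketch (the search name is hypothetical, and proper indexing à la Mu would of course be far more capable than a linear scan):

```ruby
# Return the paths of all saved item files under `root` whose raw XML
# contains `query`, case-insensitively. No index, just a full scan.
def search(root, query)
  Dir.glob(File.join(root, '**', '*.xml')).select do |path|
    File.read(path).downcase.include?(query.downcase)
  end
end
```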
I was surprised that it was easier to just sit down and write the thing for myself than to search for days for a solution.