initial commit

lc, 11 months ago
commit
48be72c866
10 files changed, with 427 additions and 0 deletions
  1. .env (+4, -0)
  2. .gitignore (+1, -0)
  3. build/Dockerfile (+42, -0)
  4. build/calibre-cron (+4, -0)
  5. build/entrypoint.sh (+17, -0)
  6. crond/calibre-cron (+5, -0)
  7. docker-compose.yml (+20, -0)
  8. readme.md (+169, -0)
  9. recipe/oriental_daily_pure.recipe (+135, -0)
  10. recipe/upkindle.sh (+30, -0)

+ 4 - 0
.env

@@ -0,0 +1,4 @@
+CALIBRE_NAME=calibre-cronx2
+CALIBRE_RECIPE=~/projects/gog/Dockers_real/calibre-cronx/recipe
+CALIBRE_CRON=~/projects/gog/Dockers_real/calibre-cronx/crond
+NEWS_PATH=~/projects/gog/Dockers_real/calibre-cronx/news

+ 1 - 0
.gitignore

@@ -0,0 +1 @@
+news/

+ 42 - 0
build/Dockerfile

@@ -0,0 +1,42 @@
+# Use the official Ubuntu as the base image
+FROM ubuntu:latest
+
+# Install dependencies
+RUN apt-get update && apt-get install -y \
+    vim \
+    cron \
+    tzdata \
+    calibre \
+    && rm -rf /var/lib/apt/lists/*
+
+# Create a directory for Calibre library
+RUN mkdir -p /calibre/library
+
+
+# Copy your cron job file into the container
+COPY calibre-cron /etc/cron.d/calibre-cron
+
+# Set permissions on the cron job file (0644: owner read/write, others read-only; no execute bit is needed)
+RUN chmod 0644 /etc/cron.d/calibre-cron
+
+# Apply the cron job
+RUN crontab /etc/cron.d/calibre-cron
+
+# Create the log file (optional, for debugging)
+RUN touch /var/log/calibre-cron.log
+
+
+# Create a directory for custom cron jobs (optional)
+RUN mkdir -p /etc/cron.d
+
+# Set the working directory
+WORKDIR /calibre
+
+# Copy the entrypoint script into the container
+COPY entrypoint.sh /entrypoint.sh
+
+# Make the entrypoint script executable
+RUN chmod +x /entrypoint.sh
+
+# Set the entrypoint to the script
+ENTRYPOINT ["/entrypoint.sh"]

+ 4 - 0
build/calibre-cron

@@ -0,0 +1,4 @@
+# Write a test message every two minutes; upkindle.sh runs daily at 05:00
+*/2 * * * *  echo "Hello, World Wide `date` !" >> /var/log/cron.log 2>&1
+#*/3 * * * *  /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1
+0 5  * * *  /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1

+ 17 - 0
build/entrypoint.sh

@@ -0,0 +1,17 @@
+#!/bin/sh
+
+# Check if the TZ environment variable is set
+if [ -n "$TZ" ]; then
+  # Set the timezone
+  echo "Setting timezone to $TZ"
+  ln -fs /usr/share/zoneinfo/$TZ /etc/localtime && dpkg-reconfigure -f noninteractive tzdata
+else
+  echo "Timezone not specified. Using default timezone."
+fi
+
+# Start cron
+echo "Starting cron..."
+cron -f
+
+
+
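The entrypoint links `/usr/share/zoneinfo/$TZ` without first checking that the name refers to a real zone file. As a minimal sketch of such a check, written in Python for illustration only (the function name `resolve_zoneinfo` and its `root` parameter are assumptions, not part of this commit), one could validate the name before linking:

```python
from pathlib import Path
from typing import Optional


def resolve_zoneinfo(tz: str, root: Path = Path("/usr/share/zoneinfo")) -> Optional[Path]:
    """Return the zoneinfo file for `tz`, or None if the name is unusable."""
    if not tz:
        return None
    candidate = (root / tz).resolve()
    # Reject names such as "../../etc/passwd" that escape the zoneinfo tree.
    if root.resolve() not in candidate.parents:
        return None
    # Only accept names that point at an existing regular file.
    return candidate if candidate.is_file() else None
```

A caller would fall back to the default timezone whenever the function returns `None`, which mirrors the `else` branch of the shell script.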

+ 5 - 0
crond/calibre-cron

@@ -0,0 +1,5 @@
+# Run a script every minute
+* * * * *  echo "Hello, World Wide `date` !" >> /var/log/cron.log 2>&1
+#* * * * *  /calibre/recipe/a.sh >> /var/log/cron.log 2>&1
+#*/3 * * * *  /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1
+0 5  * * *  /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1
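To make the schedules in the two cron files concrete, here is a small Python sketch that checks whether a five-field expression fires at a given time. It is a simplification for illustration: it only understands `*`, `*/n` and plain numbers, and it ignores cron's special handling when both day-of-month and day-of-week are restricted. The function names are hypothetical.

```python
from datetime import datetime


def field_matches(field: str, value: int, low: int) -> bool:
    """Match one cron field; `low` is the smallest legal value of the field."""
    if field == "*":
        return True
    if field.startswith("*/"):
        # A step counts from the start of the field's range.
        return (value - low) % int(field[2:]) == 0
    return value == int(field)


def schedule_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field cron expression fires at `when`."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; datetime.isoweekday() counts it as 7.
    cron_weekday = when.isoweekday() % 7
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )
```

Under these assumptions, `*/2 * * * *` matches every even minute, `* * * * *` matches every minute, and `0 5  * * *` matches only 05:00 each day.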

+ 20 - 0
docker-compose.yml

@@ -0,0 +1,20 @@
+---
+services:
+
+  calibre-cron:
+    image: calibre-cron:0.9
+    container_name: ${CALIBRE_NAME}
+    environment:
+      - PUID=1000
+      - PGID=1000
+      - TZ=Asia/Hong_Kong
+    volumes:
+      - ${CALIBRE_CRON}:/etc/cron.d
+      - ${CALIBRE_RECIPE}:/calibre/recipe 
+      - ${NEWS_PATH}:/news
+    ports:
+      - 8780:8080
+      - 8781:8081
+    restart: "no"
+
+
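The `${CALIBRE_NAME}`, `${CALIBRE_CRON}`, `${CALIBRE_RECIPE}` and `${NEWS_PATH}` placeholders are filled in from the `.env` file. The Python sketch below imitates that substitution to show the idea; it is not how Docker Compose is implemented, and unlike Compose it leaves unknown placeholders untouched instead of replacing them. `parse_env` and `interpolate` are hypothetical helper names.

```python
from string import Template


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def interpolate(template: str, env: dict) -> str:
    """Replace ${NAME} placeholders; unknown names are left as they are."""
    return Template(template).safe_substitute(env)


env = parse_env(
    "CALIBRE_NAME=calibre-cronx2\n"
    "NEWS_PATH=~/projects/gog/Dockers_real/calibre-cronx/news\n"
)
print(interpolate("container_name: ${CALIBRE_NAME}", env))
print(interpolate("- ${NEWS_PATH}:/news", env))
```

Running `docker compose config` against the real files is the reliable way to see the values Compose itself resolves.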

+ 169 - 0
readme.md

@@ -0,0 +1,169 @@
+# Calibre Docker Container with Docker Compose
+
+This repository contains a Dockerfile, Docker Compose configuration, and associated scripts to set up a Calibre environment within a Docker container. The container is based on the official Ubuntu image and includes Calibre, cron for scheduling tasks, and an entrypoint script to handle timezone configuration and cron service startup. The Docker Compose file simplifies the deployment process by allowing you to configure environment variables, volumes, and ports in a single file.
+
+---
+
+## Features
+
+- **Calibre Installation**: The container installs Calibre, a powerful and easy-to-use e-book manager.
+- **Cron Jobs**: Supports custom cron jobs for automated tasks, such as running scripts at specified intervals.
+- **Timezone Configuration**: Allows setting the timezone via the `TZ` environment variable.
+- **Logging**: Optional logging for cron jobs to facilitate debugging.
+- **Docker Compose Support**: Easily manage container configuration, volumes, and ports using `docker-compose.yml`.
+
+---
+
+## Getting Started
+
+### Prerequisites
+
+- Docker installed on your machine.
+- Docker Compose installed (if not, follow the [official guide](https://docs.docker.com/compose/install/)).
+
+---
+
+### Docker Compose Configuration
+
+The `docker-compose.yml` file simplifies the deployment process. Below is the configuration:
+
+```yaml
+services:
+  calibre-cron:
+    image: calibre-cron:0.9
+    container_name: ${CALIBRE_NAME}
+    environment:
+      - PUID=1000
+      - PGID=1000
+      - TZ=Asia/Hong_Kong
+    volumes:
+      - ${CALIBRE_CRON}:/etc/cron.d
+      - ${CALIBRE_RECIPE}:/calibre/recipe 
+      - ${NEWS_PATH}:/news
+    ports:
+      - 8780:8080
+      - 8781:8081
+    restart: "no"
+```
+
+#### Environment Variables
+
+- `CALIBRE_NAME`: The name of the container.
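+- `PUID`: User ID intended for file permissions (set to `1000` in `docker-compose.yml`). The entrypoint script in this repository does not read it.
+- `PGID`: Group ID intended for file permissions (set to `1000` in `docker-compose.yml`). The entrypoint script in this repository does not read it.
+- `TZ`: Timezone (set to `Asia/Hong_Kong` in `docker-compose.yml`).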
+- `CALIBRE_CRON`: Path to the host directory for custom cron jobs.
+- `CALIBRE_RECIPE`: Path to the host directory for Calibre recipes.
+- `NEWS_PATH`: Path to the host directory for news files.
+
+#### Ports
+
+- `8780:8080`: Maps host port 8780 to container port 8080.
+- `8781:8081`: Maps host port 8781 to container port 8081.
+
+#### Volumes
+
+- `${CALIBRE_CRON}:/etc/cron.d`: Mounts the host directory for custom cron jobs.
+- `${CALIBRE_RECIPE}:/calibre/recipe`: Mounts the host directory for Calibre recipes.
+- `${NEWS_PATH}:/news`: Mounts the host directory for news files.
+
+---
+
+### Building the Docker Image
+
+To build the Docker image, navigate to the `build/` directory, which contains the Dockerfile, and run:
+
+```bash
+docker build -t calibre-cron:0.9 .
+```
+
+---
+
+### Running with Docker Compose
+
+1. Create a `.env` file in the same directory as `docker-compose.yml` and define the required environment variables:
+
+```bash
+CALIBRE_NAME=my-calibre-container
+CALIBRE_CRON=/path/to/cron/jobs
+CALIBRE_RECIPE=/path/to/calibre/recipes
+NEWS_PATH=/path/to/news/files
+```
+
+2. Start the container using Docker Compose:
+
+```bash
+docker-compose up -d
+```
+
+---
+
+### Custom Cron Jobs
+
+You can add custom cron jobs by placing them in the `${CALIBRE_CRON}` directory on your host machine. The provided `calibre-cron` file is an example of a cron job that runs a script at 5 AM daily:
+
+```bash
+0 5 * * * /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1
+```
+
+---
+
+### Entrypoint Script
+
+The `entrypoint.sh` script handles timezone configuration and starts the cron service. It checks for the `TZ` environment variable and sets the timezone accordingly.
+
+---
+
+## Directory Structure
+
+- `/calibre/library`: Directory for storing Calibre library files.
+- `/etc/cron.d`: Directory for custom cron jobs.
+- `/var/log/calibre-cron.log`: Log file for cron job output (optional).
+- `/calibre/recipe`: Directory for Calibre recipes.
+- `/news`: Directory for news files.
+
+---
+
+## Example Cron Job
+
+The `calibre-cron` file contains an example cron job that runs a script at 5 AM daily:
+
+```bash
+0 5 * * * /calibre/recipe/upkindle.sh >> /var/log/upkindle.log 2>&1
+```
+
+---
+
+## Logging
+
+The image creates `/var/log/calibre-cron.log` for debugging, but the provided cron jobs write to `/var/log/cron.log` and `/var/log/upkindle.log`. Redirect each cron job's output to whichever log file you want to inspect.
+
+---
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+---
+
+## Acknowledgments
+
+- [Calibre](https://calibre-ebook.com/) for providing an excellent e-book management tool.
+- [Docker](https://www.docker.com/) for simplifying containerization.
+- [Docker Compose](https://docs.docker.com/compose/) for streamlining multi-container setups.
+
+---
+
+## Contributing
+
+Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
+
+---
+
+## Support
+
+For support, please open an issue in the repository or contact the maintainer directly.
+
+---
+
+This README provides a comprehensive guide to setting up and running the Calibre Docker container with Docker Compose. For more detailed instructions, refer to the Dockerfile, `docker-compose.yml`, and associated scripts.

+ 135 - 0
recipe/oriental_daily_pure.recipe

@@ -0,0 +1,135 @@
+# -*- coding: utf-8 -*-
+import re, time
+#import HTMLParser 
+from html.parser import HTMLParser
+from calibre import strftime
+from calibre.web.feeds.recipes import BasicNewsRecipe
+from calibre.ebooks.BeautifulSoup import BeautifulSoup
+
+class OrientalDailyPure(BasicNewsRecipe):
+
+    title       = 'Oriental Daily - '  + time.strftime('%d %b %Y')
+    __author__  = 'Larry Chan'
+    description = 'Oriental Daily, Hong Kong'
+    publication_type = 'newspaper'
+    language    = 'zh'
+    timefmt = ' [%a, %d %b, %Y]'
+    masthead_url = 'http://orientaldaily.on.cc/img/v2/logo_odn.png'
+    #cover_url = 'http://orientaldaily.on.cc/cnt/news/' + time.strftime('%Y%m%d') + '/photo/' + time.strftime('%m%d') + '-00174-001k1.jpg'
+    cover_url = 'https://orientaldaily.on.cc/asset/main/%s/photo/337_sectMain.jpg' % time.strftime('%Y%m%d')
+    #print ("cover %s" % cover_url)
+    delay = 0
+
+    no_stylesheets = True
+    extra_css = 'h1 {font: sans-serif large;}\n.byline {font:monospace;}'
+   
+
+#    keep_only_tags    = [
+#                       dict(name='h1'),
+#                       dict(name='a'),                                  
+#                       dict(name='img'),                                  
+#                       dict(name='div'),                                  
+#                       dict(attrs={'div': 'content'})                                  
+#                        ]
+
+    #dict(name='p', attrs={'class':['photoCaption','paragraph']})
+    #remove_tags = [dict(name=['script', 'input'])]
+    HTMLParser.attrfind = re.compile(
+                        r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
+                        r'(\'[^\']*\'|"[^"]*"|[^\s>^\[\]{}\|\'\"]*))?') 
+    
+
+
+
+    
+    def parse_index(self):
+
+
+        def extract_text(tag):
+            return str(tag.contents[0]).replace('<em>', '').replace('</em>', '')
+ 
+
+        
+        def scrap_feed(feed):
+            f_url = '%s%s' % (urlRoot, feed[0])
+            print ('feed url %s ' % f_url)
+            soup = self.index_to_soup(f_url)
+            # verify a section is available for download on the day this script is run.
+            # skip a section if unavailable   
+            # for instance, finance section is unavailable on Sunday, so is "lifestyle"
+            try:
+               articles = soup.findAll('div', 'sectionList')[0].findAll('li')
+            except:
+               print ('--- this section [%s] is not available today ---' % feed[1]) 
+               raise Exception ('--- this section [%s] is not available today ---' % feed[1]) 
+
+
+ 		
+
+            articles = map(lambda x:{'url': '%s%s' % (urlRoot, x.a['href']), 
+                            'title': x.findAll('div', attrs={'class' : 'text'})[0].text, 
+                            'date': strftime('%a, %d %b'),
+                            'description': x.findAll('div', attrs={'class' : 'text'})[0].text,
+                            'content': ''}, articles)
+            ans = []
+            for article in articles:
+                ans.append(article)    
+            return ans 
+               
+
+        urlRoot = 'https://orientaldaily.on.cc'
+        url = urlRoot 
+        soup = self.index_to_soup(url)
+        #lookups = ['news', 'china_world', 'finance', 'lifestyle', 'sport']
+        lookups = ['news', 'china_world', 'finance', 'entertainment', 'lifestyle', 'adult', 'sport']
+        # no financial news on Sunday
+        #if time.strftime('%w') == '0':
+        #   lookups.remove('finance') 
+
+        feeds = soup.findAll('ul', 'menuList clear')[0].findAll('li', attrs={'section':lookups})
+        feeds = map(lambda x: (x.a['href'], x.text), feeds)
+        feeds = list(feeds)
+
+        print ('----------------------- The feeds are: %s' % feeds)
+        ans = []
+        for e in feeds:
+            try:
+               print ('e[1] is: %s | %s\n' % (e[1], e[0]))
+               ans.append((e[1], scrap_feed(e)))
+            except Exception as err:
+               print('while processing feed: %s' % err) 
+               continue
+
+        print ('############')
+        print (ans)
+        return ans
+  
+
+
+
+    
+
+    def preprocess_html(self, soup):
+         
+        print('((((( begin article ))))')
+        try:
+            #print(soup)
+            html = str(soup.find('h1'))  + ''.join(str(t) for t in soup.findAll('div', 'content'))	
+            # download photo
+            pic = soup.find('div', 'paragraph photoParagraph')
+            #print (pic)
+            if pic is not None:
+               html += '<a href="%s"><img src="%s"></img></a>' % (str(pic.a['href']), str(pic.img['src'])) 
+            #print('>>>>>>>>>>>>>>> %s' % html)
+            return BeautifulSoup(html) 
+        except Exception as e:
+            print (e)
+            print('other article...')	
+        print('((((( end article ))))')
+        return soup 
+
+
+    def get_browser(self, *args, **kwargs):
+        br = BasicNewsRecipe.get_browser(self)
+        br.set_header('User-Agent', value='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36') 
+        return br
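`parse_index` returns a list of `(section title, [article, ...])` tuples, where each article is a dict with `url`, `title`, `date`, `description` and `content` keys. The sketch below rebuilds that structure without calibre so the shape is easy to see. It uses `time.strftime` in place of calibre's `strftime`, and `build_articles` and `build_index` are hypothetical names, not part of the recipe.

```python
from time import strftime


def build_articles(url_root, items):
    """Turn (href, text) pairs into article dicts shaped like the recipe's."""
    return [
        {
            "url": "%s%s" % (url_root, href),
            "title": text,
            "date": strftime("%a, %d %b"),
            "description": text,
            "content": "",
        }
        for href, text in items
    ]


def build_index(url_root, sections):
    """Return [(section_title, [article, ...]), ...], skipping empty sections."""
    index = []
    for title, items in sections:
        articles = build_articles(url_root, items)
        if articles:
            index.append((title, articles))
    return index
```

Skipping empty sections here plays the role of the recipe's `try`/`except` around `scrap_feed`, which drops a section that is not published on a given day.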

+ 30 - 0
recipe/upkindle.sh

@@ -0,0 +1,30 @@
+#!/bin/sh
+export TZ=Asia/Hong_Kong
+TODAY=`date +"%Y%m%d"`
+ROOTPATH=/calibre
+RECIPEPATH=$ROOTPATH/recipe
+MOBIPATH=/news
+OPTIONS="--output-profile kindle_pw"
+echo $RECIPEPATH
+#
+#
+#  download news with the recipe and save output to mobi
+#
+ebook-convert "$RECIPEPATH/oriental_daily_pure.recipe" $MOBIPATH/$TODAY-orient.mobi $OPTIONS
+#
+#
+#  convert mobi to epub
+#
+#
+ebook-convert $MOBIPATH/$TODAY-orient.mobi $MOBIPATH/$TODAY-orient.epub 
+#
+#
+#  send book to kindle
+#
+# SMTP_PASSWORD is an assumed variable name; supply it through the environment that cron gives this script instead of hardcoding the password here.
+calibre-smtp  --subject "oriental news $TODAY" --attachment $MOBIPATH/$TODAY-orient.epub --relay hwsmtp.exmail.qq.com --port 465 --username vortify-lc@algometic.com --password "$SMTP_PASSWORD" --encryption-method SSL vortify-lc@algometic.com larry1chan11@kindle.com ""
+
+#
+#  rm mobi file
+#
+rm $MOBIPATH/$TODAY-orient.mobi
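The script runs two conversions in a fixed order: the recipe is fetched into a dated MOBI file, and that MOBI is then converted to the EPUB that is mailed. As a rough Python sketch of the same plan (it only builds the command lines and runs nothing; `plan_commands` is a hypothetical helper, and the default paths are copied from the script):

```python
from datetime import date


def plan_commands(today: date, recipe_dir="/calibre/recipe", out_dir="/news"):
    """Return the two ebook-convert invocations as argv lists, in order."""
    stamp = today.strftime("%Y%m%d")
    mobi = "%s/%s-orient.mobi" % (out_dir, stamp)
    epub = "%s/%s-orient.epub" % (out_dir, stamp)
    return [
        # Step 1: run the recipe, which downloads the news and writes a MOBI.
        ["ebook-convert", "%s/oriental_daily_pure.recipe" % recipe_dir, mobi,
         "--output-profile", "kindle_pw"],
        # Step 2: convert that MOBI into the EPUB that gets mailed.
        ["ebook-convert", mobi, epub],
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)`, so a failed download would stop the pipeline before the mail step.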