Preparation

Check Environment

Check your environment. Only Ubuntu 22.04.4 LTS is supported.

lsb_release -a
# Ubuntu 22.04.4 LTS

uname -m
# x86_64
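
The same check can be scripted. The following is a minimal sketch, not part of arxiv-bot: the function names `read_os_release` and `is_supported` are hypothetical, and it assumes the standard `/etc/os-release` file, which on Ubuntu 22.04 reports `ID=ubuntu` and `VERSION_ID="22.04"` (the point release such as 22.04.4 only appears in the descriptive fields).

```python
def read_os_release(path="/etc/os-release"):
    """Parse KEY=value lines from an os-release file into a dict."""
    info = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            info[key] = value.strip('"')
    return info


def is_supported(os_info, machine):
    """Return True for Ubuntu 22.04 on x86_64, the setup this guide assumes."""
    return (
        os_info.get("ID") == "ubuntu"
        and os_info.get("VERSION_ID") == "22.04"
        and machine == "x86_64"
    )
```

On the build machine, `is_supported(read_os_release(), platform.machine())` (after `import platform`) should return `True` if the environment matches.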

Create Zip for AWS Lambda Layer

Clone my GitHub project, arxiv-bot.

git clone git@github.com:kktsuji/arxiv-bot.git
cd arxiv-bot

The Python version must be 3.12.3.

python --version
# Python 3.12.3

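Create a new directory named python and install the Python packages into it, then zip the directory. Inside the archive, the packages must sit under a top-level python directory so that Lambda can find them; this guide names the archive python.zip. This file will be used for the AWS Lambda layer.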

mkdir python
pip install -U pip
pip install -r requirements.txt -t ./python
zip -r python.zip ./python
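
Before uploading, it may be worth confirming that everything in the archive sits under the top-level python directory. The sketch below is one way to do that with the standard `zipfile` module; the function name `entries_outside_python_dir` is hypothetical and not part of arxiv-bot.

```python
import zipfile


def entries_outside_python_dir(zip_path):
    """Return archive entries that are not under the top-level python/ directory."""
    with zipfile.ZipFile(zip_path) as zf:
        return [
            name for name in zf.namelist()
            if not name.startswith("python/")
        ]
```

Calling `entries_outside_python_dir("python.zip")` should return an empty list for an archive built with the commands above.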

Webhook Settings

Get the webhook URL of the service you want to notify.
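
To confirm that the URL works before wiring it into Lambda, a test message can be posted by hand. This is a rough sketch using only the standard library. The `{"text": ...}` body is an assumption that matches Slack incoming webhooks; other services may expect a different JSON shape. The function names are hypothetical.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, text):
    """Build a POST request carrying a JSON body of the form {"text": ...}."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url, text="arxiv-bot webhook test"):
    """Send the request and return the HTTP status code."""
    request = build_webhook_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

`send_test_message("https://YOUR_WEBHOOK_URL")` performs a real network call, so run it only with a URL you own.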

(Optional) OpenAI Settings

Get an OpenAI API key if you want to summarize papers.

Note: The OpenAI API is a paid service.

AWS Lambda Settings

Lambda Layer

Open the AWS Lambda console and choose “Create layer”.

Upload the python.zip file and, under “Compatible architectures”, select the architecture of the environment where you created the zip file (x86_64 in this guide). Set the runtime to Python 3.12.

Note: If the packages are not under a top-level python directory inside the zip file, the Lambda function will fail to import third-party Python modules.

Lambda Function

Open “Create function” in the AWS Lambda console.

Select “Author from scratch” and fill in the form, choosing the Python 3.12 runtime and the architecture that matches the layer.

After creating the function, open the “Add a layer” page.

Select “Custom layers”, then choose the layer you created and its version.

Copy the entire contents of main.py from the arxiv-bot project and paste it into “Code” > “Code source” > lambda_function.py.

Then click the “Deploy” button.

Click “Test” and fill in the “Configure test event” form.

The “Event JSON” must follow this format (these parameters are used only for the test).

{
  "webhook_url": "https://YOUR_WEBHOOK_URL",
  "keywords": "keyword1,keyword2,keyword3",
  "categories": "cs.AI,cs.CV,cs.LG,eess.IV",
  "openai_api_key": "YOUR_API_KEY"
}

Each key is described below.

webhook_url: The webhook URL of the service to notify, such as Slack, Teams, or another service API.

keywords: Keywords used in arXiv search queries. Separate keywords with commas, with no spaces around the commas. Keywords are matched against titles and abstracts and are combined with “or”. For example, the value “keyword1,key word2” returns papers containing keyword1 and papers containing ‘key word2’ (a keyword that contains spaces is wrapped in single quotation marks).

categories: Categories used in arXiv search queries. They follow the same rules as keywords (comma-separated without spaces, combined with “or”), and any spaces are removed. For more details, see the arXiv Category Taxonomy.

openai_api_key: (Optional) OpenAI API key. If you do not use the paper summarization feature, leave it blank as shown below:
"openai_api_key": ""

Save the configuration and run the test.

You can check the execution results in “Code source” or in the service that receives the webhook.

Once you have confirmed that the function works, note the “Function ARN” of the Lambda function.

AWS IAM Settings

Create a policy that allows invoking the Lambda function, and a role that EventBridge Scheduler can assume.

IAM Policy

Open AWS IAM > Policies > “Create policy” and enter the following JSON, replacing the Resource value with the “Function ARN” you noted.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:YOUR-LAMBDA-FUNCTION-ARN"
        }
    ]
}
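
If you prefer to generate the policy document rather than edit it by hand, a small helper can fill in the ARN and catch an obviously malformed value. This is a sketch only; `invoke_policy` is a hypothetical name, and the check assumes the usual Lambda function ARN layout `arn:PARTITION:lambda:REGION:ACCOUNT:function:NAME`.

```python
import json


def invoke_policy(function_arn):
    """Return an IAM policy document allowing invocation of one Lambda function."""
    parts = function_arn.split(":")
    # Expected layout: arn:partition:lambda:region:account:function:name
    if len(parts) < 7 or parts[0] != "arn" or parts[2] != "lambda" or parts[5] != "function":
        raise ValueError(f"not a Lambda function ARN: {function_arn!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": function_arn,
            }
        ],
    }


def invoke_policy_json(function_arn):
    """Render the policy as indented JSON, ready to paste into the console."""
    return json.dumps(invoke_policy(function_arn), indent=4)
```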

IAM Role

Open AWS IAM > Roles > “Create role” and use the following trust policy, which lets EventBridge Scheduler assume the role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "admitEventBridge",
            "Effect": "Allow",
            "Principal": {
                "Service": "scheduler.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Then attach the policy to the role.
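
The role creation and policy attachment can also be scripted. The sketch below assumes a boto3 IAM client is passed in as `iam`; `create_scheduler_role` is a hypothetical helper, while `create_role` and `attach_role_policy` are the boto3 IAM client calls it relies on.

```python
import json

# Trust policy from this section: EventBridge Scheduler may assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "scheduler.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_scheduler_role(iam, role_name, policy_arn):
    """Create the role, attach the invoke policy, and return the role ARN."""
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role["Role"]["Arn"]
```

With boto3 installed and credentials configured, `boto3.client("iam")` could be passed as `iam`, with `policy_arn` set to the ARN of the policy created above.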

AWS EventBridge Settings

Set up a schedule that runs the Lambda function at a fixed time each day.

EventBridge Schedule

Open the AWS EventBridge console and choose “Create schedule”.

Fill in the form. Check the time zone and the cron expression so that the Lambda function runs at the intended time.
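
EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year), and one of the two day fields must be `?`. As an illustration, the hypothetical helper below builds a once-a-day expression; the hour and minute are interpreted in the time zone selected for the schedule.

```python
def daily_cron(hour, minute=0):
    """Return an EventBridge cron expression that fires once a day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({minute} {hour} * * ? *)"
```

For example, `daily_cron(9)` gives an expression for 09:00 every day.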

Set the JSON payload for the arXiv search (these parameters are used for the daily queries).

{
  "webhook_url": "https://YOUR_WEBHOOK_URL",
  "keywords": "keyword1,keyword2,keyword3",
  "categories": "cs.AI,cs.CV,cs.LG,eess.IV",
  "openai_api_key": "YOUR_API_KEY"
}
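
Because a typo in this payload only shows up when the schedule fires, it can help to check the JSON locally first. The sketch below follows the rules from the “Event JSON” description (commas separate items, spaces in categories are removed, an empty API key disables summarization). It is not how main.py parses the event; `check_payload` and `split_csv` are hypothetical names, and treating all four keys as required is an assumption.

```python
import json

REQUIRED_KEYS = ("webhook_url", "keywords", "categories", "openai_api_key")


def split_csv(value):
    """Split a comma-separated string, dropping empty items."""
    return [item for item in value.split(",") if item]


def check_payload(text):
    """Parse the payload and return a normalized summary of its fields."""
    event = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in event]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    return {
        "webhook_url": event["webhook_url"],
        # A keyword may contain spaces, so only the commas act as separators.
        "keywords": split_csv(event["keywords"]),
        # Spaces in the category list are removed before splitting.
        "categories": split_csv(event["categories"].replace(" ", "")),
        # An empty key means paper summarization is not used.
        "summarize": bool(event["openai_api_key"]),
    }
```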

Fill in the remaining fields.

Under Permissions > Execution role > Use existing role > Role name, select the IAM role you created.

Create schedule.
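
For reference, the console steps in this section correspond roughly to a single `create_schedule` call on the boto3 EventBridge Scheduler client. The sketch below assumes such a client is passed in as `scheduler`; `create_daily_schedule` is a hypothetical wrapper, and the schedule name, time zone, and ARNs are values you supply.

```python
import json


def create_daily_schedule(scheduler, name, cron, timezone, function_arn, role_arn, payload):
    """Create a schedule that invokes the Lambda function with the given payload."""
    return scheduler.create_schedule(
        Name=name,
        ScheduleExpression=cron,
        ScheduleExpressionTimezone=timezone,
        # Run at the exact scheduled time rather than within a flexible window.
        FlexibleTimeWindow={"Mode": "OFF"},
        Target={
            "Arn": function_arn,
            "RoleArn": role_arn,
            # The payload is passed to the Lambda function as its event.
            "Input": json.dumps(payload),
        },
    )
```

With boto3 installed and credentials configured, `boto3.client("scheduler")` could be passed as `scheduler`.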

Settings completed!