Subject FAQs

Topics

Articles in most services are identified with a topic selected from a controlled vocabulary defined by PA. The topic identifies the story's category.

Articles are also marked with a keyword, and the topic/keyword pairing forms the name of the story, which remains consistent for the lifetime of the story. The keyword is entered by the editorial team and can be any text value. A story with more than one keyword is a sidebar story; for example, 'Politics Brexit Nurses' is a sidebar story to 'Politics Brexit'.

Different controlled vocabularies are used for the services as follows:

  • News: news topics
  • Sport: sport topics
  • Entertainment: news topics
  • Lifestyle: lifestyle topics
  • Viral: news topics
  • Sci Tech: news topics
  • Finance: news topics
  • Motoring: does not currently use topics
  • Real Life: does not currently use topics

How do the 'patopic' and 'pakeyword' profiles work?

Topics are applied from a pre-defined list designed to cover all possible subject areas for stories. Each story created is assigned a main topic and may also be assigned further topics.

Keywords are free-text words used in conjunction with a topic to describe the content of a story. All articles about a given news event share a topic/keyword combination. Stories that are related to the main story may contain a second keyword.

An example would be:

POLITICS Cameron

to refer to a politics story about David Cameron.

Another example would be:

EGYPT Air Trump

to refer to an article containing Donald Trump's reaction to the main 'EGYPT Air' story.
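The naming rule above can be sketched in a few lines of Python. This is only an illustration: the `StoryName` class and its fields are hypothetical and not part of the PA API. It assumes that the first keyword identifies the main story and that any further keyword marks a related (sidebar) story.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StoryName:
    """Hypothetical container for a topic plus its free-text keywords."""

    topic: str
    keywords: Tuple[str, ...]

    @property
    def is_sidebar(self) -> bool:
        # More than one keyword marks a story related to a main story.
        return len(self.keywords) > 1

    @property
    def main_story(self) -> "StoryName":
        # Assumption: the main story keeps the topic and the first keyword only.
        return StoryName(self.topic, self.keywords[:1])

    def __str__(self) -> str:
        return " ".join((self.topic,) + self.keywords)


reaction = StoryName("EGYPT", ("Air", "Trump"))
print(reaction, "| sidebar:", reaction.is_sidebar, "| main story:", reaction.main_story)
```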

Here is an example of how to retrieve all articles for a specific news event:

curl \
  -H "Accept: application/json" \
  -H "apikey: <API KEY>" \
  "https://content.api.pressassociation.io/v1/item?subject=patopic:politics&subject=pakeyword:cameron"
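For clients that assemble queries in code, the same URL can be built programmatically. The sketch below uses only the Python standard library; the helper name `build_item_url` is hypothetical, and it assumes that repeating the `subject` parameter combines the filters, as in the curl example above.

```python
from urllib.parse import urlencode

BASE_URL = "https://content.api.pressassociation.io/v1/item"


def build_item_url(*subjects: str) -> str:
    """Return the item URL with one `subject` parameter per subject code."""
    query = [("subject", code) for code in subjects]
    # Keep ":" unescaped so the URL matches the style used in this FAQ.
    return f"{BASE_URL}?{urlencode(query, safe=':')}"


url = build_item_url("patopic:politics", "pakeyword:cameron")
print(url)
```

The resulting string can be passed to any HTTP client together with the `Accept` and `apikey` headers shown in the curl example.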

How does the 'paterritory' profile work?

Territories are physical locations that are applied to stories to indicate where an event took place. This category is related to a legacy distribution channel for PA’s wire services, and ‘paterritory’ maps to specific category codes utilised on the Mediadirect service. If desired it can be used to identify a UK or Irish territory relevant to a story. Further geographical information will be available via the API at a later date.

Current options are:

  • International
  • UK
  • Europe
  • USA
  • Australia

Here is an example of how to retrieve all articles set in a particular territory:

curl \
  -H "Accept: application/json" \
  -H "apikey: <API KEY>" \
  "https://content.api.pressassociation.io/v1/item?subject=paterritory:uk:england:north-west"

How does the 'pareader' profile work?

Reader type is applied to a story to indicate that its target audience is:

  • male
  • female
  • parent

Not all stories will be tagged with a reader type.

Here is an example of how to retrieve all articles aimed at a specific reader type:

curl \
  -H "Accept: application/json" \
  -H "apikey: <API KEY>" \
  "https://content.api.pressassociation.io/v1/item?subject=pareader:parent"

How does the 'tag' profile work?

Free-text tags add extra metadata to a story. They are applied during story creation to aid search and navigation, and they indicate what the story is about. Tags are often (but not always) people, places, or organisations. Examples include 'Boris Johnson', 'Pensions Secretary', 'Cancer Research', 'BBC', 'Birmingham', 'Midlands'.

Tags are also used to mark specific events or themes, and these are marked consistently by PA journalists. For example, any stories relating to the Oscars 2018 will all be marked ‘tag:oscars-2018’. Please contact PA if you would like to know what tag or tags will be used for a particular event or theme.

If a story is deemed to contain graphic content, or content not suitable for all audiences, it will be marked with a ‘Content Warning’ tag, allowing customers to filter it out using the API query.

Tags can also be used to create exclusions based on a customer’s API key. This will prevent stories containing particular tags being returned in API query results. Please contact our Customer Services team to discuss further.
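As an alternative to filtering in the query, flagged stories could also be dropped on the client side. The sketch below makes several assumptions that are not confirmed by this FAQ: each item is a dictionary with a `subject` list shaped like the JSON subject entries shown elsewhere on this page, the warning tag is matched by its name 'Content Warning', and the `headline` field and the `tag:content-warning` code in the sample data are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def has_content_warning(subjects: Iterable[Dict[str, str]]) -> bool:
    """True if any tag in the subject list is named 'Content Warning'."""
    return any(
        s.get("profile") == "tag"
        and s.get("name", "").strip().lower() == "content warning"
        for s in subjects
    )


def drop_flagged(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only items that carry no content warning tag."""
    return [i for i in items if not has_content_warning(i.get("subject", []))]


# Hypothetical sample data; field names and the warning tag code are assumptions.
items = [
    {"headline": "Example A",
     "subject": [{"code": "tag:bbc", "name": "BBC", "profile": "tag"}]},
    {"headline": "Example B",
     "subject": [{"code": "tag:content-warning", "name": "Content Warning", "profile": "tag"}]},
]
safe_items = drop_flagged(items)
print([i["headline"] for i in safe_items])
```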

Here is an example of how to retrieve all articles with a specific tag or set of tags:

curl \
  -H "Accept: application/json" \
  -H "apikey: <API KEY>" \
  "https://content.api.pressassociation.io/v1/item?subject=tag:david-cameron"
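The tag codes on this page (for example 'tag:david-cameron', 'tag:oscars-2018' and 'tag:john-smiths-stadium') look like lower-cased, hyphenated versions of the tag label. The helper below is a guess at that convention, inferred only from these examples; the function name `tag_code` is hypothetical and the actual rule used by PA may differ, so it is worth confirming the tag for any important event with PA.

```python
import re


def tag_code(label: str) -> str:
    """Guess a tag code from a display label (convention inferred, not confirmed)."""
    # Drop straight and curly apostrophes so "Smith's" becomes "smiths".
    slug = label.lower().replace("'", "").replace("\u2019", "")
    # Collapse every other run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug).strip("-")
    return f"tag:{slug}"


for label in ("David Cameron", "Oscars 2018", "John Smith's Stadium"):
    print(label, "->", tag_code(label))
```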

pastorytype

Story type is a further sub-classification of stories based on the content or format of the article. Story type can be used to identify and present certain articles in a specific location or styling within a customer’s product.

curl \
  -H "Accept: application/json" \
  -H "apikey: <API KEY>" \
  "https://content.api.pressassociation.io/v1/item?subject=pastorytype:viral"

Sport categorisation

Sport articles are also marked with a specific sport categorisation using a controlled vocabulary. The ‘sport’ and ‘patopic sport’ lists are equivalent. Both are offered because ‘sport’ is the preferred classification for API users, whereas patopic is used to feed internal PA processes.

In ATOM and RSS sport is marked like this: category term="sport:xxxxx"
In JSON sport is marked like this: sport:xxxxx

Football

Articles relating to football can have additional structured metadata as set out below. Stories will also have unstructured text-based tags for players, managers, organisations or any other relevant metadata.

Football clubs
Most football stories are categorised by relevant team or teams.
These teams map to a defined list of team IDs that correlate to PA’s Football API.
A full list of teams is available on request from your Account Manager or the onboarding team.

Football Competitions
Where relevant, stories are also classified by football competition.
These have a defined list of competitions that correlate to PA’s Football API.

Match IDs

Where stories are related to specific fixtures, they will be tagged with a specific match ID that also corresponds with PA’s Football API and related football data content.
This is to enable customers to build rich event pages or sections combining editorial and data content. PA’s onboarding team can advise further on options to link these content sets.

Venues
When a match ID is present, stories are also automatically tagged with the venue for the fixture as a text-based tag.

Example
Please see examples below of the football-specific metadata set:

<category term="club:pafootball:19" scheme="urn:pa:subject:club" label="Tottenham Hotspur"/>
<category term="competition:pafootball:100" scheme="urn:pa:subject:competition" label="Premier League"/>
<category term="match:pafootball:3998682" scheme="urn:pa:subject:match" label="Huddersfield vs Tottenham Hotspur"/>
<category term="tag:john-smiths-stadium" scheme="urn:pa:subject:tag" label="John Smith's Stadium"/>
<category term="tag:harry-kane" scheme="urn:pa:subject:tag" label="Harry Kane"/>

The same metadata in JSON:

[
  {
    "code": "club:pafootball:19",
    "name": "Tottenham Hotspur",
    "profile": "club",
    "scheme": "https://content.api.pressassociation.io/v1/subject",
    "rel": "about"
  },
  {
    "code": "competition:pafootball:100",
    "name": "Premier League",
    "profile": "competition",
    "scheme": "https://content.api.pressassociation.io/v1/subject",
    "rel": "about"
  },
  {
    "code": "match:pafootball:3998682",
    "name": "Huddersfield vs Tottenham Hotspur",
    "profile": "match",
    "scheme": "https://content.api.pressassociation.io/v1/subject",
    "rel": "about"
  },
  {
    "code": "tag:john-smiths-stadium",
    "name": "John Smith's Stadium",
    "profile": "tag",
    "scheme": "https://content.api.pressassociation.io/v1/subject",
    "rel": "about"
  },
  {
    "code": "tag:harry-kane",
    "name": "Harry Kane",
    "profile": "tag",
    "scheme": "https://content.api.pressassociation.io/v1/subject",
    "rel": "about"
  }
]
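To link editorial content to football data, a client would typically pull the numeric IDs out of these codes. The sketch below parses the `<profile>:pafootball:<id>` pattern seen in the example above; the `FootballRef` type and `football_refs` function are hypothetical names, and the sample list is a shortened copy of the JSON example.

```python
from typing import Dict, Iterable, List, NamedTuple


class FootballRef(NamedTuple):
    kind: str  # "club", "competition" or "match"
    id: int    # numeric ID, as used in the code
    name: str  # human-readable label


def football_refs(subjects: Iterable[Dict[str, str]]) -> List[FootballRef]:
    """Extract structured football references, skipping plain text tags."""
    refs = []
    for s in subjects:
        parts = s.get("code", "").split(":")
        if len(parts) == 3 and parts[1] == "pafootball" and parts[2].isdigit():
            refs.append(FootballRef(parts[0], int(parts[2]), s.get("name", "")))
    return refs


subjects = [
    {"code": "club:pafootball:19", "name": "Tottenham Hotspur", "profile": "club"},
    {"code": "competition:pafootball:100", "name": "Premier League", "profile": "competition"},
    {"code": "match:pafootball:3998682", "name": "Huddersfield vs Tottenham Hotspur", "profile": "match"},
    {"code": "tag:harry-kane", "name": "Harry Kane", "profile": "tag"},
]
for ref in football_refs(subjects):
    print(ref.kind, ref.id, ref.name)
```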

paselection

The paselection value is a descriptor of content type to allow API users to select or filter content for their application.

The available values are paselection:curated, paselection:featured and paselection:enrichedonly.

paselection:curated denotes content that is part of the PA Now product.

paselection:featured denotes content that PA’s editorial team has marked as an editorial highlight. This lets API customers give these highlights more prominence in a website or other product. Highlights are updated on a rolling basis as relevant for each service.

paselection:enrichedonly indicates a story that is only available through PA Ready and is in a format or content type particularly suited to web and mobile use.
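A client might use these values to decide how prominently to present a story. The sketch below assumes, without confirmation from this FAQ, that paselection values appear in an item's subject list as codes such as 'paselection:featured', in the same shape as the other JSON subject entries on this page. The function name `selections` is hypothetical.

```python
from typing import Dict, Iterable, Set

KNOWN_SELECTIONS = {"curated", "featured", "enrichedonly"}


def selections(subjects: Iterable[Dict[str, str]]) -> Set[str]:
    """Return the paselection values found in a subject list."""
    found = set()
    for s in subjects:
        prefix, _, value = s.get("code", "").partition(":")
        if prefix == "paselection" and value in KNOWN_SELECTIONS:
            found.add(value)
    return found


# Hypothetical subject list for a single story.
story_subjects = [
    {"code": "paselection:featured", "profile": "paselection"},
    {"code": "tag:bbc", "name": "BBC", "profile": "tag"},
]
is_highlight = "featured" in selections(story_subjects)
print("Show as highlight:", is_highlight)
```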