Toolhub/Decision record

From Meta, a Wikimedia project coordination wiki

This decision record tracks technical and design decisions made for the Toolhub project. Decisions documented here are intended to help new folks joining the project catch up on what happened before they joined. When possible, summary information recorded here should cite the relevant email and Phabricator discussions.

API first design

A fiat decision was made early in the conceptual design of Toolhub that it should be built using an "API first" design. All actions possible via a native UI for Toolhub should also be available via an HTTP API. Ideally the native UI will use the API itself, but this is not strictly required. Business logic exposed by the API and UI should always be consolidated, however. This means that any identical action possible through both the API and the native UI should ultimately invoke the same controller logic on the server side.

UI framework

Tracked in Phabricator:
Task T261029 resolved

Toolhub will use Vue.js for any advanced JavaScript user interface components. This decision is in alignment with the selection of Vue.js by the Frontend Architecture Working Group for future MediaWiki projects.

Multiple Vue.js frameworks were reviewed, with four compared in depth. Ultimately, the Vuetify framework was selected based on the full set of selection criteria, including RTL and accessibility support, availability of the UI components needed for Toolhub, support for responsive development, project popularity, and availability of long-term support releases.

Localization support

MediaWiki has some of the most robust tools for localizing UI messages of any software project. The Wikimedia community is global and expects us to be better than average at localization. This combination sets a high bar for supporting i18n and l10n in Toolhub.

Backend messages

Frontend messages

We use vue-i18n with a custom banana-i18n message parser to mark messages for translation in the frontend.

The custom parser makes us deviate from some vue-i18n conventions for the convenience of our translators:

  • Always use $t(...) to render messages, even for plurals.
  • Always use positional placeholders in messages ($1, $2, etc) and pass message parameters as an ordered array rather than as an object/dictionary at runtime.
  • Use banana-i18n's {{PLURAL:$1|pluralform1|pluralform2|...}} syntax in messages where plural forms are required. See the banana-i18n docs for more details.
  • New messages must be added to both the en.json and qqq.json message files.
  • The qqq descriptions for messages should include a "Parameters: ..." section describing any positional parameters used in the message so that translators have a better chance at understanding which word forms to use.
  • Use our custom <I18nHtml></I18nHtml> component to render messages with HTML positional parameters. This component is roughly equivalent to vue-i18n's <i18n></i18n> functional component.
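
Several of these conventions can be checked mechanically. The following is a minimal sketch of such a check, assuming that en.json and qqq.json are flat objects mapping message keys to strings and that keys starting with "@" hold metadata rather than messages. The function name lint_messages and the sample message keys are hypothetical.

```python
import re

# Positional placeholders such as $1 or $2.
POSITIONAL = re.compile(r"\$(\d+)")
# Named placeholders such as {count}; double braces like {{PLURAL:...}} are ignored.
NAMED = re.compile(r"(?<!\{)\{(\w+)\}(?!\})")


def lint_messages(en, qqq):
    """Return a list of human-readable problems found in the message files."""
    problems = []
    for key, text in en.items():
        if key.startswith("@"):  # assumed metadata entry, not a message
            continue
        if NAMED.search(text):
            problems.append(f"{key}: use positional placeholders ($1, $2), not named ones")
        if key not in qqq:
            problems.append(f"{key}: missing from qqq.json")
            continue
        if POSITIONAL.search(text) and "Parameters:" not in qqq[key]:
            problems.append(f"{key}: qqq description lacks a 'Parameters:' section")
    return problems


# Hypothetical sample data for illustration only.
en = {
    "@metadata": {"authors": []},
    "tools-found": "Found $1 {{PLURAL:$1|tool|tools}}",
    "greeting": "Hello {name}",
    "about": "About Toolhub",
}
qqq = {
    "tools-found": "Search result summary. Parameters:\n* $1 - number of tools found",
    "greeting": "Greeting shown after login.",
}

problems = lint_messages(en, qqq)
for problem in problems:
    print(problem)
```

With this sample data the check reports the named placeholder in "greeting" and the missing qqq entry for "about", while "tools-found" passes.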

RTL language support

Tracked in Phabricator:
Task T261020 resolved
  • Use cssjanus (or a similar tool) to produce "flipped" layouts
    • In practice, we are using the built-in $vuetify.rtl direction flag from Vuetify to perform flipping.
  • Do not use inline CSS styles; instead use class- or id-based rules, which can be directionally flipped when appropriate
    • When using Vuetify CSS helper classes for spacing, never use the right and left directions. Use the start and end directions instead. The s and e rules flip when the $vuetify.rtl direction flag changes.
  • Have native speakers of RTL languages help check for RTL display problems
  • Some good high level advice at https://material.io/design/usability/bidirectionality.html
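
The rule about spacing helper classes lends itself to an automated check. The sketch below rewrites left/right spacing helpers to their start/end equivalents in a class string. The class name pattern (for example ml-4, pr-2, ml-md-n3, mr-auto) is an assumption based on Vuetify 2 helper naming and should be verified against the Vuetify version in use. The function name to_logical is hypothetical.

```python
import re

# Spacing helpers that hard-code a physical side: m or p, then l or r, then a
# size with an optional breakpoint, e.g. "ml-4", "pr-2", "ml-md-n3", "mr-auto".
# The lookarounds avoid matching inside longer class names such as "html-4".
PHYSICAL_SPACING = re.compile(
    r"(?<![\w-])([mp])([lr])-((?:(?:sm|md|lg|xl)-)?(?:n?\d+|auto))(?![\w-])"
)

# left -> start, right -> end
LOGICAL = {"l": "s", "r": "e"}


def to_logical(class_attr):
    """Rewrite left/right spacing helpers in a class string to start/end ones."""
    return PHYSICAL_SPACING.sub(
        lambda m: f"{m.group(1)}{LOGICAL[m.group(2)]}-{m.group(3)}",
        class_attr,
    )


print(to_logical("ml-4 pr-2 mt-1 text-center"))
```

A check like this could run over template files in a pre-commit hook; comparing each class string with its rewritten form identifies the helpers that would not flip in RTL mode.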

Translations for dynamic content

Tracked in Phabricator:
Task T259838 resolved
  • Is there a way to feed selected strings from the user generated content to translatewiki.net (TWN) and get translations back?
    • Yes, we have identified 2 ways this might be accomplished: via a git repo of exported strings; via Action API calls to TWN. A "tech spike" is needed to provide a better idea of the feasibility of an API driven approach. We would need to make both TWN and ourselves comfortable with the latencies and load that an API based approach would place on TWN.
  • Is there a robust Django translation app that can be reused to provide in-app translations?
    • A number of Django apps related to dynamic translations were reviewed in phab:T259838#6460927. These should be revisited to see if any are directly usable once tech spikes have clarified the method that will be used to send messages to TWN for translation and load the resulting translations. A system like django-translations could be used with either an API or git dump/load system. django-vinaigrette might be usable as part of a git dump/load system.
  • Generally we should try to find controlled vocabularies that can be translated using normal TWN integrations as much as possible.

Content moderation support

Tracked in Phabricator:
Task T261023 resolved

Where there is free form text content, there will be vandalism. This is an Internet truism that Wikimedians are well aware of. MediaWiki includes many components to help support patrolling content submissions. Toolhub will need systems for this as well, but what systems? How can we make the process of patrolling Toolhub feel friendly to folks who are used to doing content patrolling with MediaWiki?

General assumptions
  • All edits will be made by authenticated users. No anon edits will be allowed.
  • Authentication will be tied to Wikimedia OAuth and by extension SUL accounts.
  • Content that can be edited will have various levels of "protection" ranging from any authenticated user can edit to only the original content creator (or an admin) can edit.
Everyone
  • View edit history for a toolinfo record (like action=history in MediaWiki)
  • View edit history for an individual editor (like Special:Contributions in MediaWiki)
  • View edit history for all toolinfo records (like Special:RecentChanges in MediaWiki)
  • View audit log for administrative actions, possibly partially redacted depending on action (like Special:Log in MediaWiki)
Authenticated users
  • Undo an edit (like action=edit&undo=<revid> in MediaWiki)
  • Revert content to a prior good edit (there may be no exact MediaWiki equivalent for this)
Patrollers
  • Mark an edit as reviewed/patrolled (like action=markpatrolled in MediaWiki)
  • Work queue/edit history filter showing edits that have not been reviewed (like the ! marker in Special:RecentChanges)
Oversighters
  • Suppress an edit (like Special:RevisionDelete in MediaWiki)
Global CheckUsers
  • Request that backend administrators with access to non-public activity log information gather information on IP addresses involved in an edit, or edits made from a range of IP addresses.
    • Building a full replacement for Extension:CheckUser is out of scope at this point. This will be treated more like similar investigations involving Phabricator or Gerrit.
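
The role lists above can be read as a capability matrix. The following sketch encodes them as data, assuming that roles are not cumulative (a patroller does not automatically gain oversighter rights) and that every role beyond "everyone" requires an authenticated session. The action names and the function allowed_actions are hypothetical labels for the items listed above.

```python
# Hypothetical action names derived from the role lists above.
CAPABILITIES = {
    "everyone": {
        "view_tool_history",
        "view_user_contributions",
        "view_recent_changes",
        "view_audit_log",
    },
    "authenticated": {"undo_edit", "revert_to_prior_edit"},
    "patroller": {"mark_patrolled", "view_unpatrolled_queue"},
    "oversighter": {"suppress_edit"},
    "global_checkuser": {"request_ip_investigation"},
}


def allowed_actions(roles, authenticated):
    """Return the set of actions available to a user.

    roles: iterable of extra role names held by the user (e.g. "patroller").
    authenticated: whether the user is logged in; anonymous users only get
    the read-only "everyone" actions.
    """
    actions = set(CAPABILITIES["everyone"])
    if authenticated:
        actions |= CAPABILITIES["authenticated"]
        for role in roles:
            actions |= CAPABILITIES.get(role, set())
    return actions


print(sorted(allowed_actions(["patroller"], authenticated=True)))
```

Keeping the matrix as data makes it straightforward to show patrollers which actions they hold, in the same way MediaWiki lists user group rights.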

Visual style

Colors

Toolhub adheres to Wikimedia's visual style guidelines for colors. The colors and their uses are listed below.

Accent colors emphasize an action or highlight key information. Different variations of base colors are used for a wide variety of content areas.

| Category | Vuetify theme name | Style guide name | Hexcode | Use |
|----------|--------------------|------------------|---------|-----|
| Accent | primary | accent50 | #36c | Used as the background color in the navigation bar, buttons, selected rows in a table, charts, etc. |
| Base | secondary | base10 | #202122 | Used as the background color of the navigation drawer, buttons with secondary actions, etc. |
| Base | | base20 | #54595d | Used as the background color in icons. |
| Base | | base80 | #e0e0e0 | Used as the background color in tool cards and lists. |
| Base | accent | base90 | #f8f9fa | Used as the background color in buttons for page navigation, toolbars, tables, etc. |
| Base | | base100 | #fff | Used as the text color for darker backgrounds. |
| Utility | error | red50 | #d33 | Used as the background color in alert components to indicate error. |
| Utility | success | green50 | #00af89 | Used as the background color in alert components to indicate success. |
| Utility | warning | yellow50 | #fc3 | Used as the background color in alert components to indicate warning. |
| Utility | info | yellow30 | #ac6600 | Used as the background color in alert components to indicate general information. |

CC0 Content license

Tracked in Phabricator:
Task T288832 resolved

Toolinfo.json data and other structured data collected in Toolhub is licensed under the Creative Commons CC0 Dedication (CC0). Re-users are encouraged to provide attribution by linking back to Toolhub, but this is not required for license compliance. It is the responsibility of maintainers of externally hosted toolinfo.json data files to ensure that their contributors are aware of the CC0 licensing requirement. Toolinfo data copied from Wikimedia content wikis or other documentation under a non-CC0 license should restrict the copied description content to 50 words/250 characters to limit claims of potential copyright license obligations, like attribution. The word/character limit only applies to copied content, and original CC0 descriptions may be any length.
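
The length limit for copied descriptions could be checked as follows. This sketch assumes the conservative reading of "50 words/250 characters", namely that a copied description must satisfy both limits, and that words are separated by whitespace. The function name copied_description_ok is hypothetical.

```python
MAX_WORDS = 50
MAX_CHARS = 250


def copied_description_ok(description):
    """Return True if a description copied from non-CC0 material is within
    both the word limit and the character limit."""
    word_count = len(description.split())
    return word_count <= MAX_WORDS and len(description) <= MAX_CHARS


print(copied_description_ok("A short description copied from a content wiki."))
```

Original descriptions written under CC0 are not subject to this check, so it would only apply where the content is known to be copied.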

Toolinfo data is primarily factual information, which is not typically granted copyright protection. One notable exception is the description field, which allows a relatively large amount of freeform prose content. This description, especially when copied from a content wiki (like English Wikipedia) where a different free content license is in use, could be considered by some to be copyrighted material rather than a fair use quotation. We discussed extending the toolinfo.json schema to include an optional license and author notation just for the description field, to better support this likely use case of copying tool descriptions from a content wiki into the Toolhub catalog. Ultimately we felt that introducing this level of flexibility in content licensing would cause more difficulty for re-users of toolinfo data than is warranted. Instead we will adopt a CC0 dedication for all data, so that it can be easily reused.

Content ownership/modification model

Toolinfo records stored in Toolhub have a concept of "ownership". The owner of a record is the user who created it within Toolhub. That act of creation could be direct use of the API to upload the initial record, use of the Toolhub web UI to create the initial record, or submitting a URL which when visited by the crawler creates a new record.

Toolinfo records imported by crawling a toolinfo.json URL can only be changed by changing the external data. We made this choice to remove the difficult problem of 3-way merging changes made via the API and changes made to the crawled source material.

A toolinfo record created directly via the API (or via the UI, since it is just an API client) can only be edited by the user who created the record (as reported by the created_by field in the GET /api/tools/{name}/ response), by Administrators, and by Oversighters. Administrators and Oversighters are generally only expected to edit records created by others to remove or suppress "problematic" content.
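
The editing rules for core toolinfo records can be summarized in a short permission check. This is a sketch under stated assumptions: the record carries a hypothetical origin field ("crawler" or "api"), created_by is treated as a plain username string, and the group names are illustrative. How Administrators and Oversighters act on crawled records is not specified above, so the sketch simply disallows direct edits to them.

```python
# Hypothetical group names for users who may edit records created by others.
PRIVILEGED_GROUPS = {"administrator", "oversighter"}


def can_edit(record, user):
    """Decide whether a user may directly edit a core toolinfo record.

    record: dict with "origin" ("crawler" or "api") and "created_by".
    user: dict with "username" and "groups".
    """
    # Crawled records can only be changed by changing the external data.
    if record["origin"] == "crawler":
        return False
    # The creator of an API/UI record can always edit it.
    if user["username"] == record["created_by"]:
        return True
    # Otherwise only privileged groups may edit.
    return bool(PRIVILEGED_GROUPS & set(user["groups"]))


record = {"origin": "api", "created_by": "Alice"}
print(can_edit(record, {"username": "Bob", "groups": []}))
```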

Each toolinfo record can also have an "annotations" layer of data. Annotations have an open editing policy to allow the Wikimedia community to help improve existing records through collaborative editing. Some annotations mirror fields from the "core" toolinfo.json specification. When both the core and annotation data for a given field are populated, the Toolhub native user interface will display the core data rather than the annotation data. This has been chosen as a compromise between the conflicting desires for tool maintainers to have ultimate control of their data records and for the community to be able to improve the documentation for tools.
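
The display precedence between core and annotation data can be sketched as a small lookup. What counts as "populated" is an assumption here (anything other than None, an empty string, an empty list, or an empty object), and the function name display_value and the sample field are hypothetical.

```python
# Values treated as "not populated" in this sketch (an assumption).
EMPTY_VALUES = (None, "", [], {})


def display_value(core, annotations, field):
    """Return the value the UI would show for a field: core data when it is
    populated, otherwise the community annotation, otherwise None."""
    value = core.get(field)
    if value not in EMPTY_VALUES:
        return value
    return annotations.get(field)


core = {"repository": ""}
annotations = {"repository": "https://example.org/repo"}
print(display_value(core, annotations, "repository"))
```

In this example the annotation is shown because the maintainer left the core field empty; once the maintainer fills it in, the core value takes over.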

Taxonomy v2

After a round of community feedback and input, we made the following decisions about which categories and values to implement in the first productionized version of the Toolhub taxonomy:

Revise the Tasks attribute values:

  • Remove "Creating or uploading content"
  • Add "Creating new content"
  • Rename "Generating and recommending content" to "Recommending content"
  • Add "Uploading or importing"
  • Rename "Editing" to "Editing or updating"
  • Remove "Patrolling"
  • Add:
    • Identifying policy violations
    • Identifying spam
    • Identifying vandalism
    • Patrolling recent changes
    • Warning users

Revise the Content types attribute values:

  • Add an additional level of hierarchy to group content types and enable both broad and specific values to be applied
  • Remove "Files"
  • Split "Maps" and "Geographic Data"
  • Split "Books" and "Bibliographic Data"
  • Rename "Audio or sound files" to "Audio"