Help:Cloud Services introduction: Difference between revisions

From Wikitech
m revert edits by <username hidden>
Tag: Rollback
m Reverted edits by NoobThreePointOh (talk) to last revision by Majavah
Tag: Rollback
(17 intermediate revisions by 10 users not shown)
Line 1: Line 1:
{{See|wmflabs.org and wmcloud.org redirect here. You might be looking for the [[:Category:Toolforge tools|Lists of Toolforge tools]] or the [[toolforge:openstack-browser/project/|List of Cloud VPS projects]].}}

[[File:What is Cloud Services? poster.pdf|alt=A Poster showing some Wikimedia Cloud services statistics and services|thumb|upright=1.3|Poster-format overview]]
'''Wikimedia Cloud Services (WMCS)''' provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. Use Cloud Services to host your software tools for the [[meta:Special:MyLanguage/Wikimedia movement|Wikimedia movement]], without charge.


== What can you do with Cloud Services? ==
[[Wikimedia_Cloud_Services_team|'''Wikimedia Cloud Services''' ('''WMCS''')]] provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. Use Cloud Services to host your software tools for the [[meta:Special:MyLanguage/Wikimedia movement|Wikimedia movement]], without charge.

== Service concepts ==

{{ContentGrid
|content=
{{Colored box
|title = Data as a service
|content = [[#Quarry|Quarry]] and [[#PAWS|PAWS]] empower '''technically curious to advanced users''' to query wiki replicas and create scripts, tutorials, and data visualizations to analyze and improve Wikimedia projects.

See [[#Data Services]] below.
}}
{{Colored box
|title = Platform as a service
|content = [[#Toolforge|Toolforge]] is for '''intermediate to advanced users''' working on tools, bots, and web services that support Wikimedia projects.

See [[#Toolforge]] below.
}}
{{Colored box
|title = Infrastructure as a service
|content = [[#Cloud_VPS|Cloud VPS]] is for '''advanced users''' who need to administer their own servers for Wikimedia operations and software development.

See [[#Cloud VPS]] below.
}}
}}

Are you unsure? Check [[#Which service is right for you?]] below.

== Toolforge ==

[[File:Toolforge_logo.svg|right|frameless|60px|alt=Toolforge|link=Portal:Toolforge]]

''[[Portal:Toolforge|Toolforge]]'' is one of the projects hosted by Wikimedia Cloud VPS. It is a shared hosting (platform as a service) environment for volunteers to develop and run [https://admin.toolforge.org/tools tools], [[mw:Manual:Creating a bot|continuous bots]], web services, scheduled jobs, and data analysis.

To use Toolforge you will need some programming knowledge, an understanding of the Unix command line, and familiarity with version control via [[mw:Gerrit|Gerrit]] and Git.

Users of the Toolforge project create so-called "tool" accounts (technically ''service groups''). These accounts allow one or more users to collaborate to manage the software source code, configuration, and jobs for that tool or bot.

The Toolforge administrators manage a pool of virtual servers which provide a shared project hosting environment that can be used by Toolforge users. These resources include [[Help:Toolforge/Web|web servers]], [[Help:Toolforge/Database|databases]] and [[Help:Toolforge#Redis|other data storage]], and a [[Help:Toolforge/Grid|distributed job processing system]]. These services provide a reliable and scalable hosting environment for volunteers to develop and operate their tools and bots.

For additional documentation and help with Toolforge, see [[Portal:Toolforge]].

== Cloud VPS ==

[[File:Wikimedia_Cloud_Services_logo.svg|right|frameless|60px|alt=Cloud Services|link=Portal:Cloud VPS]]

''[[Portal:Cloud VPS|Cloud VPS]]'' (Virtual Private Server) is a [[:w:Cloud computing|cloud computing]] environment powered by [[:w:OpenStack|OpenStack]]. It offers collaboratively owned collections of virtual private servers. You can use this infrastructure to create and maintain open source software projects that help the [[:meta:Special:MyLanguage/Wikimedia movement|Wikimedia movement]].

The environment includes access to a variety of data services. Cloud VPS allows developers and system administrators to try out improvements to Wikimedia infrastructure (including MediaWiki), power research and analytics, and host projects that are not viable in the Toolforge environment.

Cloud VPS is for advanced users who want to get involved in Wikimedia operations and software development. Cloud VPS contains [[toolforge:openstack-browser/project/|many projects]], each of which uses one or more instances.


=== How is Cloud VPS organized? ===


=== Host tools on Wikimedia servers ===
Cloud VPS is divided into projects. Each project has separate members and administrators who can create and maintain virtual machines ("instances") for use by that project. Each project can have its own access policies, DNS records, etc.
{{anchor|Toolforge|reason=Old section name.}}


Tools and bots make it easier to edit and maintain Wikimedia projects. For developers who support Wikimedia projects by developing tools and bots, '''[[Help:Toolforge | Toolforge]]''' provides the following features:
=== Who gets a Cloud VPS project? ===
* Free, reliable, and scalable shared hosting, including web servers, databases and other data storage
* A distributed job processing system
* Support for multiple users to collaboratively maintain and manage tools


To use Toolforge you need:
A VPS project can be granted for any Wikimedia-adjacent work that cannot be accomplished using other WMCS offerings. Cloud VPS instances go through a request and approval process, and large resource requests (e.g. dozens of gigabytes of RAM or hundreds of gigabytes of disk space) will receive extra scrutiny. Before requesting, explore whether Toolforge or another service will adequately meet your needs.
* Some programming knowledge
* An understanding of the Unix command line


To get started, visit [[Help:Toolforge]]. Or, [https://developer.wikimedia.org/build-tools/ learn more about creating bots].
Because Cloud VPS projects are typically resource-intensive and pose some long-term security risks, all projects must have one or more active maintainers. Maintainers must have an active Phabricator account, must subscribe to the [https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/ cloud-announce] mailing list, and must respond to queries and requests for action from WMCS staff and admins. Instances are not permanent and are reviewed periodically for potential deletion.


=== What is a Cloud VPS project? ===
=== Run scripts and visualize data ===
{{anchor|PAWS|reason=Old section name.}}
'''[[PAWS]]''' is a Jupyter notebook installation hosted by Wikimedia. PAWS notebooks can be used for creating tutorials, running live code, creating data visualizations, running basic bots, and more.


Each PAWS notebook is maintained by a single user, but notebooks can be downloaded and forked by other users. To use PAWS you need only a Wikimedia login and a web browser. Knowledge of Python is helpful, but not required.
A project is a unit of privilege separation inside the Cloud VPS environment. Each project has separate management of membership, virtual machines, HTTPS proxies, firewall rules, etc. Examples of projects include [[Portal:Toolforge|Toolforge]] and the [[Nova_Resource:Deployment-prep|Beta Cluster]].


=== How does Cloud VPS work? ===
=== Administer servers for software development ===
{{anchor|Cloud VPS|reason=Old section name.}}
Open source software projects help the Wikimedia movement by improving core infrastructure (like MediaWiki), powering research and analytics, and supporting Wikimedia operations and software development. For advanced projects that aren't viable in the Toolforge environment, '''[[Help:Cloud_VPS | Cloud VPS]]''' (Virtual Private Server) provides the following features:
* Free cloud computing environment, powered by [[:w:OpenStack|OpenStack]]
* Collaboratively owned collections of virtual private servers, storage, firewall, and HTTPS proxy resources
* Access to a variety of data services
* Freedom to install packages not provided by Debian or the Wikimedia Foundation
To use Cloud VPS, you need:
* An open source project that isn't viable in the Toolforge environment, or can't be accomplished using other WMCS offerings
* One or more active project maintainers who meet basic requirements
* Advanced programming knowledge
* Advanced experience with the Unix command line
* The ability to administer your own servers and manage your project's applications, data, runtime, middleware, and operating systems


To get started, visit [[Help:Cloud VPS]].
Cloud VPS is a virtualization cluster that hosts virtual machines (called instances) using [http://www.openstack.org/software/openstack-compute OpenStack Compute]. This differs slightly from conventional servers that you SSH into (e.g. Toolserver): virtual machines do not exist as separate physical hardware but run inside a much larger machine called the host machine. More details about the physical setup of Cloud VPS can be found under [[Portal:Cloud VPS/Infrastructure]].


== What is the difference between Cloud VPS and Toolforge? ==
==== What is the difference between Cloud VPS and Toolforge? ====


Cloud VPS is an [[:w:Cloud_computing#Infrastructure_as_a_service_.28IaaS.29|Infrastructure as a service (IaaS)]] solution. It provides virtual machines, storage, firewall, and HTTPS proxy resources to projects. The members of each individual project are responsible for managing applications, data, runtime, middleware, and operating systems themselves.


Toolforge is a [[:w:Cloud_computing#Platform_as_a_service_.28PaaS.29|Platform as a service (PaaS)]] solution. It provides [[Help:Toolforge/Web|web servers]], [[Help:Toolforge/Database|databases]] and [[Help:Toolforge#Redis|other data storage]], and a [[Help:Toolforge/Grid|distributed job processing system]] as managed services that can be used by tools and their maintainers.
Toolforge is a [[:w:Cloud_computing#Platform_as_a_service_.28PaaS.29|Platform as a service (PaaS)]] solution. It provides [[Help:Toolforge/Web|web servers]], [[Help:Toolforge/Database|databases]], and a [[Help:Toolforge/Grid|distributed job processing system]] as managed services for tool maintainers.


=== Access databases and data dumps ===
== Data Services ==
{{anchor|Data Services|reason=Old section name.}}
{{Note|What is a wiki replica? What's in the dumps? Learn the basics of Wikimedia open data and how to access all available datasets at [[meta:Research:Data|Research:Data]].}}


==== Access wiki databases for tool development ====
''[[Portal:Data Services|Data Services]]'' are a collection of products which provide access to copies of Wikimedia's production wiki databases (with private information redacted) and access to [[meta:Special:MyLanguage/Data dumps|Wikimedia data dumps]]. Use data services to create replicas of the production databases and other data for analysis and experimentation.


Tools and software hosted on Toolforge and Cloud VPS can directly access public data dumps and production wiki replicas.
There are also services to interact with data in a web browser: Quarry and PAWS.
* Learn about [[Help:Toolforge/Database | accessing wiki replica databases from a tool account]].
* Learn about [[Help:Shared_storage | accessing dumps through shared storage services for Cloud VPS and Toolforge]].


==== Query wiki replicas and dumps in a browser ====
=== Quarry ===


* '''[[PAWS]]''' provides a Jupyter notebook environment you can use to query wiki replicas and dumps, create interactive graphs, and use APIs for analysis.
[[File:Quarry-logo.svg|right|frameless|200px|alt=Quarry|link=meta:Special:MyLanguage/Research:Quarry]]
* '''[[Superset]]''' and '''[https://quarry.wmcloud.org/ Quarry]''' are web interfaces for querying live replica SQL databases of public Wikimedia wikis.

To use Superset or Quarry you need only a Wikimedia login and a web browser, but you should have a basic understanding of SQL. Learn about [[Help:MySQL_queries|SQL queries]].
''Quarry'' is a public querying interface for [[Help:Toolforge/Database|Wiki Replicas]], a set of live replica [[:w:en:SQL|SQL]] databases of public Wikimedia wikis. Quarry is designed to make running queries against Wiki Replicas easy. Quarry also allows researchers to share and review each other's queries.

Quarry queries are run by individual users. Queries can be saved, published, and forked by other users.

To use Quarry you need only a Wikimedia login and a web browser. A basic understanding of SQL is recommended. Learn about [[Help:MySQL_queries|SQL queries]].

=== PAWS ===

[[File:PAWS.svg|right|frameless|200px|alt=PAWS|link=PAWS]]

''[[PAWS]]'' is a Jupyter notebook installation hosted by Wikimedia. PAWS notebooks can be used for creating tutorials, running live code, creating data visualizations, running bots using Pywikibot, and more.

Each PAWS notebook is maintained by a single user, but notebooks can be downloaded and forked by other users.

To use PAWS you need only a Wikimedia login and a web browser. Knowledge of Python is helpful, but not required.


== Which service is right for you? ==
Line 111: Line 68:
|+
|+
! rowspan="2" |Activity / Needs
! rowspan="2" |Activity / Needs
!PAWS '''(DaaS)'''
!PAWS
!Superset
!Quarry (DaaS)
!Toolforge (PaaS)
!Toolforge
!Cloud VPS (IaaS)
!Cloud VPS
|-
|-
|Data as a service
|Data as a service
Line 121: Line 78:
|Infrastructure as a service
|Infrastructure as a service
|-
|-
|Write documentation and create tutorials
|Write scripts and visualize data
|✔
|✔
|
|
Line 127: Line 84:
|
|
|-
|-
|Write queries against wiki replica databases
|Browser based
|✔
|✔
|
|
|-
|Write queries against replica databases
|✔
|✔
|✔
|✔
Line 139: Line 90:
|✔ via Toolforge
|✔ via Toolforge
|-
|-
|Access database dump files
|Access wiki database dump files
|✔
|✔
|
|
Line 148: Line 99:
|✔
|✔
|
|
|✔ easily
|✔
|✔ if not viable on Toolforge
|✔ manually
|-
|-
|Run web services
|Run web services
|
|
|
|
|✔ easily
|✔
|✔ if not viable on Toolforge
|✔ manually
|-
|-
|Build tools to improve Wikimedia projects
|Build tools to improve Wikimedia projects
|
|
|
|
|✔ easily
|✔
|✔ if not viable on Toolforge
|✔ manually
|-
|-
|Schedule or run continuous jobs
|Schedule or run continuous jobs
|
|
|
|
|✔ easily
|✔
|✔ if not viable on Toolforge
|✔ manually
|-
|Terminal based
|
|
|✔
|✔
|-
|-
|Need your own subdomain
|Need your own subdomain
Line 198: Line 143:
|
|
|✔
|✔
|-
!Platform / Environment
|web browser
|web browser
|terminal
|terminal
|-
|-
!User knowledge
!User knowledge
Line 206: Line 157:
|}
|}


== Get started ==
== Before you start ==


To use Cloud Services products, you must first [[Help:Create a Wikimedia developer account|create a Wikimedia account and a developer account]].
Set up your Toolforge or Cloud VPS projects by following the instructions on [[Help:Getting Started]].


== Learn more ==
{{:Help:Cloud Services communication}}


== Technology stack ==

WMCS is a computing ecosystem built on [[w:OpenStack|OpenStack]] and [[w:Kubernetes|Kubernetes]].

== Learn more ==
* [[Portal:Toolforge|Toolforge Portal]]: Information about Toolforge and links to help and technical documentation.
* [[Portal:Cloud VPS|Cloud VPS Portal]]: Information about Cloud VPS and links to help and technical documentation.
* [[Portal:Data Services|Data Services Portal]]: Information about Data Services and links to help and technical documentation.
* See the [[Help:Glossary|glossary]] for detailed definitions of terms which are specific to Toolforge and Cloud VPS.
* 🎬 Video: [https://media.ccc.de/v/36c3-77-wikimedia-cloud-services-introduction Wikimedia Cloud Services introduction] (2019)
* 📣 Slides: [[commons:File:Introduction_to_Wikimedia_Cloud_Services_-_Wikimania_Hackathon_2019_Stockholm_Sweden.pdf|An introduction to Cloud Services presentation]] (2019)


{{:Help:Cloud Services communication}}
== Historical information ==

From 2011 until early 2017, ''Wikimedia Cloud Services'' was known as ''Wikimedia Labs''. However, the term ''Labs'' was used for [[Labs labs labs|several different things]].


[[Category:Overviews]]
In 2017, the former ''Wikimedia Foundation Labs team'' and ''Tool Labs Support team'' merged into the ''[[mw:Wikimedia Cloud Services team|Wikimedia Cloud Services team]]''.
[[Category:Cloud Services]]

Latest revision as of 21:35, 3 February 2024


Wikimedia Cloud Services (WMCS) provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. Use Cloud Services to host your software tools for the Wikimedia movement, without charge.

What can you do with Cloud Services?

Host tools on Wikimedia servers

Tools and bots make it easier to edit and maintain Wikimedia projects. For developers who support Wikimedia projects by developing tools and bots, Toolforge provides the following features:

  • Free, reliable, and scalable shared hosting, including web servers, databases and other data storage
  • A distributed job processing system
  • Support for multiple users to collaboratively maintain and manage tools

To use Toolforge you need:

  • Some programming knowledge
  • An understanding of the Unix command line

To get started, visit Help:Toolforge. Or, learn more about creating bots.

Run scripts and visualize data

PAWS is a Jupyter notebook installation hosted by Wikimedia. PAWS notebooks can be used for creating tutorials, running live code, creating data visualizations, running basic bots, and more.

Each PAWS notebook is maintained by a single user, but notebooks can be downloaded and forked by other users. To use PAWS you need only a Wikimedia login and a web browser. Knowledge of Python is helpful, but not required.

Administer servers for software development

Open source software projects help the Wikimedia movement by improving core infrastructure (like MediaWiki), powering research and analytics, and supporting Wikimedia operations and software development. For advanced projects that aren't viable in the Toolforge environment, Cloud VPS (Virtual Private Server) provides the following features:

  • Free cloud computing environment, powered by OpenStack
  • Collaboratively owned collections of virtual private servers, storage, firewall, and HTTPS proxy resources
  • Access to a variety of data services
  • Freedom to install packages not provided by Debian or the Wikimedia Foundation

To use Cloud VPS, you need:

  • An open source project that isn't viable in the Toolforge environment, or can't be accomplished using other WMCS offerings
  • One or more active project maintainers who meet basic requirements
  • Advanced programming knowledge
  • Advanced experience with the Unix command line
  • The ability to administer your own servers and manage your project's applications, data, runtime, middleware, and operating systems

To get started, visit Help:Cloud VPS.

What is the difference between Cloud VPS and Toolforge?

Cloud VPS is an Infrastructure as a service (IaaS) solution. It provides virtual machines, storage, firewall, and HTTPS proxy resources to projects. The members of each individual project are responsible for managing applications, data, runtime, middleware, and operating systems themselves.

Toolforge is a Platform as a service (PaaS) solution. It provides web servers, databases, and a distributed job processing system as managed services for tool maintainers.

Access databases and data dumps

What is a wiki replica? What's in the dumps? Learn the basics of Wikimedia open data and how to access all available datasets at Research:Data.

Access wiki databases for tool development

Tools and software hosted on Toolforge and Cloud VPS can directly access public data dumps and production wiki replicas.

Query wiki replicas and dumps in a browser

  • PAWS provides a Jupyter notebook environment you can use to query wiki replicas and dumps, create interactive graphs, and use APIs for analysis.
  • Superset and Quarry are web interfaces for querying live replica SQL databases of public Wikimedia wikis.

To use Superset or Quarry you need only a Wikimedia login and a web browser, but you should have a basic understanding of SQL. Learn about SQL queries.
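
As a rough illustration of the kind of SQL these tools accept, the sketch below runs a query against an in-memory SQLite table that stands in for a wiki replica. The `page` table and its `page_namespace`, `page_title`, and `page_is_redirect` columns follow MediaWiki's schema. The sample rows, the helper names, and the use of SQLite are assumptions made so the example runs anywhere; the real replicas are not SQLite and are reached through Quarry, Superset, PAWS, or a tool account.

```python
import sqlite3

# Hypothetical sample rows: (page_id, page_namespace, page_title, page_is_redirect).
# A real replica holds the wiki's actual page table.
SAMPLE_PAGES = [
    (1, 0, "Main_Page", 0),
    (2, 0, "Toolforge", 0),
    (3, 0, "Tool_Labs", 1),                      # a redirect in the main namespace
    (4, 12, "Cloud_Services_introduction", 0),   # namespace 12 is Help
]

# List non-redirect pages in the main (article) namespace, sorted by title.
QUERY = """
SELECT page_title
FROM page
WHERE page_namespace = 0
  AND page_is_redirect = 0
ORDER BY page_title
"""


def build_stand_in_replica():
    """Create an in-memory SQLite database that mimics a tiny `page` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        " page_id INTEGER PRIMARY KEY,"
        " page_namespace INTEGER,"
        " page_title TEXT,"
        " page_is_redirect INTEGER)"
    )
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", SAMPLE_PAGES)
    return conn


def non_redirect_articles(conn):
    """Run QUERY and return the matching titles as a list of strings."""
    return [row[0] for row in conn.execute(QUERY)]


if __name__ == "__main__":
    print(non_redirect_articles(build_stand_in_replica()))
```

The SELECT statement is the part that carries over: pasted into a query interface with a wiki database selected, it would be expected to return the corresponding titles from that wiki, though a LIMIT clause is advisable on large wikis.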

Which service is right for you?

Activity / Needs PAWS Superset Toolforge Cloud VPS
Data as a service Data as a service Platform as a service Infrastructure as a service
Write scripts and visualize data ✔
Write queries against wiki replica databases ✔ ✔ ✔ ✔ via Toolforge
Access wiki database dump files ✔ ✔
Write and run bots ✔ ✔ ✔ if not viable on Toolforge
Run web services ✔ ✔ if not viable on Toolforge
Build tools to improve Wikimedia projects ✔ ✔ if not viable on Toolforge
Schedule or run continuous jobs ✔ ✔ if not viable on Toolforge
Need your own subdomain ✔ ✔
Work with co-maintainers and co-admins ✔ ✔
Install packages not provided by Debian or the Wikimedia Foundation ✔
Administer your own virtual server ✔
Platform / Environment web browser web browser terminal terminal
User knowledge curious–advanced curious–advanced intermediate–advanced advanced
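
The comparison above can also be read as a lookup: list what you need, then keep the services that cover every item. The sketch below encodes only a subset of the rows (those whose column placement is unambiguous in the table) and is purely illustrative. The need names, the `CAPABILITIES` mapping, and the `services_for` helper are hypothetical and not part of any WMCS tooling.

```python
# Services in the column order of the table, from least to most
# administrative effort.
SERVICES = ["PAWS", "Superset", "Toolforge", "Cloud VPS"]

# Hypothetical need names mapped to the services ticked in the table.
# Only a subset of the table's rows is encoded here.
CAPABILITIES = {
    "write_scripts_and_visualize_data": {"PAWS"},
    "query_wiki_replicas": {"PAWS", "Superset", "Toolforge", "Cloud VPS"},
    "run_web_services": {"Toolforge", "Cloud VPS"},
    "schedule_or_run_continuous_jobs": {"Toolforge", "Cloud VPS"},
    "install_custom_packages": {"Cloud VPS"},
    "administer_own_virtual_server": {"Cloud VPS"},
}


def services_for(needs):
    """Return the services that satisfy every need, lightest option first."""
    unknown = [need for need in needs if need not in CAPABILITIES]
    if unknown:
        raise KeyError(f"unknown needs: {unknown}")
    return [
        service
        for service in SERVICES
        if all(service in CAPABILITIES[need] for need in needs)
    ]


if __name__ == "__main__":
    print(services_for(["query_wiki_replicas", "run_web_services"]))
```

Because the result keeps the table's column order, Toolforge is listed before Cloud VPS whenever both qualify, which matches the "if not viable on Toolforge" guidance. An empty list means no single service in this subset covers all the stated needs.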

Before you start

To use Cloud Services products, you must first create a Wikimedia account and a developer account.

Learn more

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)