Help:Cloud Services introduction: Difference between revisions

From Wikitech
Aklapper (talk | contribs)
m s/http/https/
Aklapper (talk | contribs)
Rewrite page to be a proper overview page with proper funneling, plus better info how to pick the service that suits you best. See phab:T292811.
{{notice | wmflabs.org and wmcloud.org redirect here. You might be looking for the [[:Category:Toolforge tools|Lists of Toolforge tools]] or the [[toolforge:openstack-browser/project/|List of Cloud VPS projects]].}}
{{See|wmflabs.org and wmcloud.org redirect here. You might be looking for the [[:Category:Toolforge tools|Lists of Toolforge tools]] or the [[toolforge:openstack-browser/project/|List of Cloud VPS projects]].}}


[[File:What is Cloud Services? poster.pdf|thumb|upright=1.3|Poster-format overview]]


''Wikimedia Cloud Services (WMCS)'' provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. Use Cloud Services to host your software tools for the [[meta:Wikimedia movement|Wikimedia movement]], without charge.
== WMCS project overview ==


== Types of services ==
===What is Wikimedia Cloud Services (WMCS)?===


{{ContentGrid
'''Wikimedia Cloud Services''' ('''WMCS''') provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. WMCS is a computing ecosystem built on [[W:OpenStack|OpenStack]], [[W:Oracle Grid Engine|GridEngine]], and [[W:Kubernetes|Kubernetes]].
|content=
{{Colored box
|title = Data as a service
|content = [[#Quarry|Quarry]] and [[#PAWS|PAWS]] empower '''technically curious to advanced users''' to query wiki replicas and create scripts, tutorials, and data visualizations to analyze and improve Wikimedia projects.


See [[#Data Services]] below.
WMCS products and services are available for use by anyone connected with the [[meta:Wikimedia movement|Wikimedia movement]] without charge. Support and administration of the WMCS resources is provided by a [[Mw:Wikimedia Cloud Services team|Wikimedia Foundation Cloud Services team]] and [[wmf:Volunteer_opportunities|Wikimedia movement volunteers]]. We maintain a [[Help:Glossary|Glossary]] of related terminology.
}}
{{Colored box
|title = Platform as a service
|content = [[#Toolforge|Toolforge]] is for '''intermediate to advanced users''' working on tools, bots, and web services that support Wikimedia projects.


See [[#Toolforge]] below.
🎬 '''Video''': [https://media.ccc.de/v/36c3-77-wikimedia-cloud-services-introduction Wikimedia Cloud Services introduction] (2019)
}}
{{Colored box
|title = Infrastructure as a service
|content = [[#Cloud_VPS|Cloud VPS]] is for '''advanced users''' who need to administer their own servers for Wikimedia operations and software development.


See [[#Cloud VPS]] below.
📣 '''Slides''': [[commons:File:Introduction_to_Wikimedia_Cloud_Services_-_Wikimania_Hackathon_2019_Stockholm_Sweden.pdf|An introduction to Cloud Services presentation]] (2019)
}}
}}


Are you unsure? Check [[#Which service is right for you?]] below.
=== WMCS history ===


== Toolforge ==
From 2011 until early 2017, WMCS was known as "Wikimedia Labs." The term 'Labs' was used to refer to a number of different [[Labs labs labs|components]], and clarification was required. In 2017, the project was reorganized. The former Wikimedia Foundation Labs team and the Tool Labs Support team joined together to create the Wikimedia Cloud Services team.


[[File:Toolforge_logo.svg|right|frameless|60px|alt=Toolforge|link=Portal:Toolforge]]
{{anchor|WMCS Products}}


''[[Portal:Toolforge|Toolforge]]'' is one of the projects hosted by Wikimedia Cloud VPS. It is a shared hosting (platform as a service) environment for volunteers to develop and run [https://admin.toolforge.org/tools tools], [[mw:Manual:Creating a bot|continuous bots]], web services, scheduled jobs, and data analysis.
== WMCS products and services ==


To use Toolforge you will need some programming knowledge, an understanding of the Unix command line, and familiarity with version control via Gerrit and Git.
{{anchor|What product should I use?}}
{| class="wikitable sortable"
|+ WMCS Products
! Service
! Product
! Description
! Use
! Support Level
|-
| [[:en:Virtual_private_server|VPS]]
| [[Portal:Cloud VPS|Cloud VPS]]
| Provides collaboratively owned collections of virtual private servers where users develop and maintain software projects that help the Wikimedia movement.
| Use this to run full virtual instances.
| You are willing to administer instances on your own. We can provide quota to do so.  
|-
| [[:en:Platform as a service|PaaS]]
| [[Portal:Toolforge|Toolforge]]
| Provides a shared hosting/platform-as-a-service environment for running bots, webservices, scheduled jobs, and data analysis.
| Run a specific webservice, scheduled job, or perform analysis.
| You do not want to or are not able to manage a full virtual environment.
|-
| [[:en:Data_as_a_service|DaaS]]
| [[Portal:Data Services|Data Services]]
| A collection of products including private-information-redacted copies of Wikimedia's production wiki databases and access to [[Dumps.wikimedia.org|Wikimedia Dumps]].
| Create replicas of the production databases and other data for analysis and experimentation.  
| The [[quarry:|Quarry service]] provides database access via a web interface. Some DaaS resources may need to be requested for specific VPS projects.
|}


Users of the Toolforge project create so-called "tool" accounts (technically ''service groups'') which allow one or more users to collaborate to manage the software source code, configuration, and jobs for that tool or bot.
=== Renaming of products and services ===
We are in the process of changing the [[phab:phame/post/view/59/labs_and_tool_labs_being_renamed/| language and branding]] of the products and services we offer. You may find some outdated titles and names in WMCS documentation. Edits are welcome!


The Toolforge administrators manage a pool of virtual servers that provide a shared project hosting environment that can be used by Toolforge users. These resources include [[Help:Toolforge/Web|web servers]], [[Help:Toolforge/Database|databases]] and [[Help:Toolforge#Redis|other data storage]], and a [[Help:Toolforge/Grid|distributed job processing system]]. These services provide a reliable and scalable hosting environment for volunteers to develop and operate their tools and bots.
== Participating with WMCS ==


For additional documentation and help with Toolforge, see [[Portal:Toolforge]].
=== Sign up for services ===


== Cloud VPS ==
To access and contribute to Cloud Services projects and tools, you will need the following accounts:


[[File:Wikimedia_Cloud_Services_logo.svg|right|frameless|60px|alt=Cloud Services|link=Portal:Cloud VPS]]
*'''Wikimedia account''' - this account is the single user login or SUL account you use to contribute to Wikipedia and its sister projects.
*[[Help:Create_a_Wikimedia_developer_account|'''Wikimedia developer account''']] - this account is used to log into this wiki, Toolforge, Cloud VPS, Gerrit and other protected Wikimedia Services.
* '''[[mw:Gerrit|Gerrit]]''' - our code review system; where our repositories (repos) live. Note that while [[mw:Gerrit/GitHub|GitHub]] contains many of our public repos, you can only make pull requests for Cloud Services projects via Gerrit. Other Wiki projects may use GitHub exclusively.
* '''[[mw:Phabricator|Phabricator]]''' - our project management system; for opening tickets, suggesting features, and talking about our plans for the next quarter.
* '''[[mw:MediaWiki on IRC|IRC]]''' - live chat channels. We have several channels related to our cloud servers, but the main channel is {{IRC|wikimedia-cloud}}. Deployment also frequently uses {{IRC|wikimedia-serviceops}}.
* The ability to [[Help:Accessing Cloud VPS instances|access instances]] in the WMCS environment.


''[[Portal:Cloud VPS|Cloud VPS]]'' (Virtual Private Server) is a [[:w:Cloud computing|cloud computing]] environment powered by [[:w:OpenStack|OpenStack]]. It offers collaboratively owned collections of virtual private servers. You can use this infrastructure to create and maintain open source software projects that help the [[:meta:Wikimedia movement|Wikimedia movement]].
===Review the terms and conditions===


The environment includes access to a variety of data services. Cloud VPS allows developers and system administrators to try out improvements to Wikimedia infrastructure (including MediaWiki), power research and analytics, and host projects that are not viable in the Toolforge environment.
Second, make sure to review and agree to our terms and conditions. [[Help:Terminology|Account holders]] who plan to use WMCS resources and products must read and agree to the following:


Cloud VPS is for advanced users who want to get involved in Wikimedia operations and software development. Cloud VPS contains [https://openstack-browser.toolforge.org/project/ many projects], each of which uses one or more instances.
* [[Wikitech:Cloud Services Terms of use|Wikimedia Cloud Services Terms of Use]]
* [[mw:Code of Conduct|Code of Conduct for technical spaces]]
* [[mw:Wikimedia Labs/Agreement to disclosure of personally identifiable information|Agreement to disclosure of personally identifiable information]] (covers [[Help:Terminology|End-Users]]).


Cloud VPS instances must go through a request and approval process. Instances are not permanent and are reviewed periodically for potential deletion/removal. Cloud VPS instances are resource intensive. Before requesting, explore whether Toolforge or another service will adequately meet your needs.
Please pay close attention to the following terms for '''Toolforge and Cloud VPS:'''


=== How is Cloud VPS organized? ===
* Toolforge tools must be [[w:Open-source software|open source software]] licensed under an [https://opensource.org/licenses OSI approved license].

* Toolforge and Cloud VPS projects must not collect, store, or share private data or personally identifiable information, such as user names, passwords, or IP addresses, except when complying with the conditions listed in the [[Wikitech:Cloud Services Terms of use|Wikimedia Cloud Services Terms of Use]].
Cloud VPS is divided into projects. Each project has separate members and administrators who can create and maintain virtual machines ("instances") for use by that project. Each project can have its own access policies, DNS records, etc.

=== What is a Cloud VPS project? ===

A project is a unit of privilege separation inside the Cloud VPS environment. Each project has separate management of membership, virtual machines, HTTPS proxies, firewall rules, etc. Examples of projects include [[Portal:Toolforge|Toolforge]] and the [[Nova_Resource:Deployment-prep|Beta Cluster]].
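
The idea of a project as a unit of privilege separation can be illustrated with a small toy model. This is only a sketch: the <code>Project</code> class, its fields, and the user and instance names are hypothetical and do not correspond to any real Cloud VPS or OpenStack API.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Toy model of a Cloud VPS project with its own members, admins, and instances."""
    name: str
    members: set[str] = field(default_factory=set)
    admins: set[str] = field(default_factory=set)
    instances: set[str] = field(default_factory=set)

    def create_instance(self, user: str, instance: str) -> None:
        # Privileges are scoped to the project: only this project's
        # members or admins may add a virtual machine to it.
        if user not in self.members | self.admins:
            raise PermissionError(f"{user} has no rights in project {self.name}")
        self.instances.add(instance)


# Hypothetical project and user names, for illustration only.
alpha = Project("alpha", members={"alice"}, admins={"carol"})
beta = Project("beta", members={"bob"})

alpha.create_instance("alice", "alpha-web-01")

try:
    beta.create_instance("alice", "beta-db-01")  # alice is not in beta
except PermissionError as err:
    denied = str(err)
```

Membership in one project grants nothing in another, which is the property the model is meant to show.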

===How does Cloud VPS work? ===

Cloud VPS is a virtualization cluster and hosts various virtual machines (called instances) using [http://www.openstack.org/software/openstack-compute OpenStack Compute]. This is slightly different from the normal servers that you ssh to (e.g. Toolserver), as virtual machines do not exist physically, but reside inside a much bigger machine called the host machine. More details about the physical setup of Cloud VPS can be found under [[Portal:Cloud VPS/Infrastructure]].

== What is the difference between Cloud VPS and Toolforge? ==

Cloud VPS is an [[:w:Cloud_computing#Infrastructure_as_a_service_.28IaaS.29|Infrastructure as a service (IaaS)]] solution. It provides virtual machines, storage, firewall, and HTTPS proxy resources to projects. The members of each individual project are responsible for managing applications, data, runtime, middleware, and operating systems themselves.

Toolforge is a [[:w:Cloud_computing#Platform_as_a_service_.28PaaS.29|Platform as a service (PaaS)]] solution. It provides [[Help:Toolforge/Web|web servers]], [[Help:Toolforge/Database|databases]] and [[Help:Toolforge#Redis|other data storage]], and a [[Help:Toolforge/Grid|distributed job processing system]] as managed services that can be used by tools and their maintainers.

== Data Services ==

''[[Portal:Data Services|Data Services]]'' are a collection of products including private-information-redacted copies of Wikimedia's production wiki databases and access to [[dumps.wikimedia.org|Wikimedia Dumps]]. Use data services to create replicas of the production databases and other data for analysis and experimentation.

There are also services to interact with data in a web browser: Quarry and PAWS.

=== Quarry ===

[[File:Quarry-logo.svg|right|frameless|200px|alt=Quarry|link=meta:Research:Quarry]]

''Quarry'' is a public querying interface for [[Help:Toolforge/Database|Wiki Replicas]], a set of live replica SQL databases of public Wikimedia Wikis. Quarry is designed to make running queries against Wiki Replicas easy. Quarry also provides a means for researchers to share and review each other's queries.

Quarry queries are run by individual users. They can be saved, published, and forked by other users.

To use Quarry you need only a Wikimedia login and a web browser. Quarry can be used by individuals with understanding along the technical spectrum. A basic understanding of SQL is recommended. Learn about [[Help:MySQL_queries|SQL queries]].
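
To give a feel for the kind of SQL involved, here is a minimal local sketch. It assumes an in-memory SQLite table that borrows a few column names from MediaWiki's <code>page</code> table; the rows are made up, and the real replicas are not SQLite, so dialect details may differ. In Quarry itself only the SQL text would be entered in the browser.

```python
import sqlite3

# Local stand-in for a replica database: an in-memory SQLite table that
# copies a few column names from MediaWiki's `page` table. The rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page ("
    "page_id INTEGER PRIMARY KEY, page_namespace INTEGER, "
    "page_title TEXT, page_is_redirect INTEGER, page_len INTEGER)"
)
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, "Main_Page", 0, 5200),
        (2, 0, "Example", 0, 800),
        (3, 0, "Old_name", 1, 40),
        (4, 1, "Main_Page", 0, 12000),
    ],
)

# The kind of query one might write in Quarry: the longest
# non-redirect pages in the main (article) namespace.
query = """
    SELECT page_title, page_len
    FROM page
    WHERE page_namespace = 0 AND page_is_redirect = 0
    ORDER BY page_len DESC
    LIMIT 10
"""
rows = conn.execute(query).fetchall()
for title, length in rows:
    print(title, length)
```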

=== PAWS ===

[[File:PAWS.svg|right|frameless|200px|alt=PAWS|link=PAWS]]

''PAWS'' is a Jupyter notebook installation hosted by Wikimedia. PAWS notebooks can be used for creating tutorials, running live code, creating data visualizations, running bots using Pywikibot, and more.

PAWS notebooks are maintained by a single user. They can be downloaded and forked by other users.

To use PAWS you need only a Wikimedia login and a web browser. PAWS can be used by individuals with understanding along the technical spectrum. A knowledge of Python is helpful, but not required.
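
As a rough idea of what a small notebook cell might look like, the sketch below tallies a handful of records with the Python standard library. The records and their field names are invented for illustration; a real notebook would more likely load data from the replicas, dumps, or Pywikibot.

```python
from collections import Counter

# Made-up sample records standing in for data a notebook might load;
# the field names are assumptions for this sketch.
edits = [
    {"user": "Alice", "wiki": "enwiki"},
    {"user": "Bob", "wiki": "dewiki"},
    {"user": "Alice", "wiki": "enwiki"},
    {"user": "Carol", "wiki": "enwiki"},
    {"user": "Alice", "wiki": "dewiki"},
]

edits_per_user = Counter(record["user"] for record in edits)
edits_per_wiki = Counter(record["wiki"] for record in edits)

# Print a small text report, most active user first.
for user, count in edits_per_user.most_common():
    print(f"{user}: {count}")
```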

== Which service is right for you? ==

{| class="wikitable"
|+
!Activity / Needs
!Quarry (DaaS)
!PAWS (DaaS)
!Toolforge (PaaS)
!Cloud VPS (IaaS)
|-
|Browser based
|✔
|✔
|
|
|-
|Terminal based
|
|
|✔
|✔
|-
|Write queries against replica databases
|✔
|✔
|✔
|
|-
|Run database dumps
|
|✔
|✔
|
|-
|Write and run bots
|
|✔
|✔
|
|-
|Run web services
|
|
|✔
|
|-
|Build tools to improve Wikimedia projects
|
|
|✔
|
|-
|Schedule or run continuous jobs
|
|
|✔
|
|-
|Administer your own virtual server
|
|
|
|✔
|-
|Need your own subdomain
|
|
|
|✔
|-
|Write documentation and create tutorials
|
|✔
|
|
|-
|Work with co-maintainers and co-admins
|
|
|✔
|✔
|-
!User knowledge
|curious to advanced
|curious to advanced
|intermediate to advanced
|advanced
|-
!Service concept
|Data as a service
|Data as a service
|Platform as a service
|Infrastructure as a service
|}
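
The table can also be read as a lookup: given a set of needs, which services tick all of them? The sketch below encodes the checkmarks from the table as a Python dictionary. The names <code>SUPPORT</code> and <code>services_for</code> are made up for this illustration and are not part of any WMCS tooling.

```python
SERVICES = ("Quarry", "PAWS", "Toolforge", "Cloud VPS")

# Rows of the table above: need -> services marked with a check.
SUPPORT = {
    "browser based": {"Quarry", "PAWS"},
    "terminal based": {"Toolforge", "Cloud VPS"},
    "write queries against replica databases": {"Quarry", "PAWS", "Toolforge"},
    "run database dumps": {"PAWS", "Toolforge"},
    "write and run bots": {"PAWS", "Toolforge"},
    "run web services": {"Toolforge"},
    "build tools to improve wikimedia projects": {"Toolforge"},
    "schedule or run continuous jobs": {"Toolforge"},
    "administer your own virtual server": {"Cloud VPS"},
    "need your own subdomain": {"Cloud VPS"},
    "write documentation and create tutorials": {"PAWS"},
    "work with co-maintainers and co-admins": {"Toolforge", "Cloud VPS"},
}


def services_for(*needs: str) -> list[str]:
    """Return the services that tick every requested need, in table column order."""
    candidates = set(SERVICES)
    for need in needs:
        candidates &= SUPPORT[need.lower()]
    return [service for service in SERVICES if service in candidates]


print(services_for("Write and run bots", "Browser based"))
print(services_for("Run web services", "Schedule or run continuous jobs"))
```

An empty result means no single service covers the combination, in which case the needs may have to be split across services.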

== Get started ==

Set up your Toolforge or Cloud VPS projects by following the instructions on [[Help:Getting Started]].


{{:Help:Cloud Services communication}}

== Technology stack==

WMCS is a computing ecosystem built on [[w:OpenStack|OpenStack]], [[w:Oracle Grid Engine|GridEngine]], and [[w:Kubernetes|Kubernetes]].
Cloud VPS projects use [[Help:Horizon FAQ|Horizon]].


== Learn more ==
* [[Portal:Toolforge|Toolforge Portal]]: Information about Toolforge and links to help and technical documentation.
* [[Portal:Cloud VPS|Cloud VPS Portal]]: Information about Cloud VPS and links to help and technical documentation.
* [[Portal:Data Services|Data Services Portal]]: Information about Data Services and links to help and technical documentation.
* See the [[Help:Glossary|Glossary]] for detailed definitions of terms which are specific to Toolforge and Cloud VPS.
* 🎬 Video: [https://media.ccc.de/v/36c3-77-wikimedia-cloud-services-introduction Wikimedia Cloud Services introduction] (2019)
* 📣 Slides: [[commons:File:Introduction_to_Wikimedia_Cloud_Services_-_Wikimania_Hackathon_2019_Stockholm_Sweden.pdf|An introduction to Cloud Services presentation]] (2019)

== Historical information ==


From 2011 until early 2017, ''Wikimedia Cloud Services'' was known as ''Wikimedia Labs''. However, the term ''Labs'' was used for [[Labs labs labs|several different things]].
* [[Help:FAQ | Cloud VPS and Toolforge At-A-Glance]]: This page provides a basic introduction to '''Cloud VPS''' and '''Toolforge'''.
* [[Portal:Cloud_VPS | Cloud VPS Portal]]: Information about Cloud VPS and links to help and technical documentation.
* [[Portal:Toolforge | Toolforge Portal]]: Information about Toolforge and links to help and technical documentation.
* [[Portal:Data_Services | Data Services Portal]]: Information about Data Services and links to help and technical documentation.


In 2017, the former ''Wikimedia Foundation Labs team'' and ''Tool Labs Support team'' merged to form the ''[[mw:Wikimedia Cloud Services team|Wikimedia Cloud Services team]]''.
[[Category:Cloud Services]]

Revision as of 17:19, 8 March 2022

Poster-format overview

Wikimedia Cloud Services (WMCS) provides tools, services, and support for technical collaborators who want to contribute to Wikimedia software projects. Use Cloud Services to host your software tools for the Wikimedia movement, without charge.

Types of services

Data as a service

Quarry and PAWS empower technically curious to advanced users to query wiki replicas and create scripts, tutorials, and data visualizations to analyze and improve Wikimedia projects.

See #Data Services below.

Platform as a service

Toolforge is for intermediate to advanced users working on tools, bots, and web services that support Wikimedia projects.

See #Toolforge below.

Infrastructure as a service

Cloud VPS is for advanced users who need to administer their own servers for Wikimedia operations and software development.

See #Cloud VPS below.

Are you unsure? Check #Which service is right for you? below.

Toolforge

Toolforge is one of the projects hosted by Wikimedia Cloud VPS. It is a shared hosting (platform as a service) environment for volunteers to develop and run tools, continuous bots, web services, scheduled jobs, and data analysis.

To use Toolforge you will need some programming knowledge, an understanding of the Unix command line, and familiarity with version control via Gerrit and Git.

Users of the Toolforge project create so-called "tool" accounts (technically service groups) which allow one or more users to collaborate to manage the software source code, configuration, and jobs for that tool or bot.

The Toolforge administrators manage a pool of virtual servers that provide a shared project hosting environment that can be used by Toolforge users. These resources include web servers, databases and other data storage, and a distributed job processing system. These services provide a reliable and scalable hosting environment for volunteers to develop and operate their tools and bots.

For additional documentation and help with Toolforge, see Portal:Toolforge.

Cloud VPS

Cloud VPS (Virtual Private Server) is a cloud computing environment powered by OpenStack. It offers collaboratively owned collections of virtual private servers. You can use this infrastructure to create and maintain open source software projects that help the Wikimedia movement.

The environment includes access to a variety of data services. Cloud VPS allows developers and system administrators to try out improvements to Wikimedia infrastructure (including MediaWiki), power research and analytics, and host projects that are not viable in the Toolforge environment.

Cloud VPS is for advanced users who want to get involved in Wikimedia operations and software development. Cloud VPS contains many projects, each of which uses one or more instances.

Cloud VPS instances must go through a request and approval process. Instances are not permanent and are reviewed periodically for potential deletion/removal. Cloud VPS instances are resource intensive. Before requesting, explore whether Toolforge or another service will adequately meet your needs.

How is Cloud VPS organized?

Cloud VPS is divided into projects. Each project has separate members and administrators who can create and maintain virtual machines ("instances") for use by that project. Each project can have its own access policies, DNS records, etc.

What is a Cloud VPS project?

A project is a unit of privilege separation inside the Cloud VPS environment. Each project has separate management of membership, virtual machines, HTTPS proxies, firewall rules, etc. Examples of projects include Toolforge and the Beta Cluster.

How does Cloud VPS work?

Cloud VPS is a virtualization cluster and hosts various virtual machines (called instances) using OpenStack Compute. This is slightly different from the normal servers that you ssh to (e.g. Toolserver), as virtual machines do not exist physically, but reside inside a much bigger machine called the host machine. More details about the physical setup of Cloud VPS can be found under Portal:Cloud VPS/Infrastructure.

What is the difference between Cloud VPS and Toolforge?

Cloud VPS is an Infrastructure as a service (IaaS) solution. It provides virtual machines, storage, firewall, and HTTPS proxy resources to projects. The members of each individual project are responsible for managing applications, data, runtime, middleware, and operating systems themselves.

Toolforge is a Platform as a service (PaaS) solution. It provides web servers, databases and other data storage, and a distributed job processing system as managed services that can be used by tools and their maintainers.

Data Services

Data Services are a collection of products including private-information-redacted copies of Wikimedia's production wiki databases and access to Wikimedia Dumps. Use data services to create replicas of the production databases and other data for analysis and experimentation.

There are also services to interact with data in a web browser: Quarry and PAWS.

Quarry

Quarry is a public querying interface for Wiki Replicas, a set of live replica SQL databases of public Wikimedia Wikis. Quarry is designed to make running queries against Wiki Replicas easy. Quarry also provides a means for researchers to share and review each other's queries.

Quarry queries are run by individual users. They can be saved, published, and forked by other users.

To use Quarry you need only a Wikimedia login and a web browser. Quarry can be used by individuals with understanding along the technical spectrum. A basic understanding of SQL is recommended. Learn about SQL queries.

PAWS

PAWS is a Jupyter notebook installation hosted by Wikimedia. PAWS notebooks can be used for creating tutorials, running live code, creating data visualizations, running bots using Pywikibot, and more.

PAWS notebooks are maintained by a single user. They can be downloaded and forked by other users.

To use PAWS you need only a Wikimedia login and a web browser. PAWS can be used by individuals with understanding along the technical spectrum. A knowledge of Python is helpful, but not required.

Which service is right for you?

{| class="wikitable"
!Activity / Needs!!Quarry (DaaS)!!PAWS (DaaS)!!Toolforge (PaaS)!!Cloud VPS (IaaS)
|-
|Browser based||✔||✔|| ||
|-
|Terminal based|| || ||✔||✔
|-
|Write queries against replica databases||✔||✔||✔||
|-
|Run database dumps|| ||✔||✔||
|-
|Write and run bots|| ||✔||✔||
|-
|Run web services|| || ||✔||
|-
|Build tools to improve Wikimedia projects|| || ||✔||
|-
|Schedule or run continuous jobs|| || ||✔||
|-
|Administer your own virtual server|| || || ||✔
|-
|Need your own subdomain|| || || ||✔
|-
|Write documentation and create tutorials|| ||✔|| ||
|-
|Work with co-maintainers and co-admins|| || ||✔||✔
|-
!User knowledge
|curious to advanced||curious to advanced||intermediate to advanced||advanced
|-
!Service concept
|Data as a service||Data as a service||Platform as a service||Infrastructure as a service
|}

Get started

Set up your Toolforge or Cloud VPS projects by following the instructions on Help:Getting Started.

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)

Technology stack

WMCS is a computing ecosystem built on OpenStack, GridEngine, and Kubernetes. Cloud VPS projects use Horizon.

Learn more

Historical information

From 2011 until early 2017, Wikimedia Cloud Services was known as Wikimedia Labs. However, the term Labs was used for several different things.

In 2017, the former Wikimedia Foundation Labs team and Tool Labs Support team merged to form the Wikimedia Cloud Services team.