doc/user/project/import/github.md
You can import your GitHub projects from either GitHub.com or GitHub Enterprise. Importing projects does not migrate or import any types of groups or organizations from GitHub to GitLab.
Imported issues, merge requests, comments, and events have an Imported badge in GitLab.
The namespace is a user or group in GitLab, such as gitlab.com/sidney-jones or
gitlab.com/customer-success.
Using the GitLab UI, the GitHub importer always imports from the
github.com domain. If you are importing from a self-hosted GitHub Enterprise Server domain, use the
import API GitHub endpoint with a GitLab access token with the api scope.
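As a rough illustration of the API route for GitHub Enterprise Server, the following sketch builds (but does not send) a request to the GitHub import endpoint. The hostnames, tokens, repository ID, and namespace are placeholders, and the helper name `build_import_request` is hypothetical. The parameter names follow the import API as we understand it, so check them against the API reference for your GitLab version before relying on them.

```python
import json
import urllib.request


def build_import_request(gitlab_url, gitlab_token, github_token, repo_id,
                         target_namespace, github_hostname=None, new_name=None):
    """Build (but do not send) a POST request for the GitHub import endpoint."""
    payload = {
        "personal_access_token": github_token,  # GitHub token
        "repo_id": repo_id,                      # numeric GitHub repository ID
        "target_namespace": target_namespace,    # GitLab user or group path
    }
    if new_name is not None:
        payload["new_name"] = new_name
    if github_hostname is not None:
        # Only set for GitHub Enterprise Server; leave unset for github.com.
        payload["github_hostname"] = github_hostname
    return urllib.request.Request(
        url=f"{gitlab_url.rstrip('/')}/api/v4/import/github",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "PRIVATE-TOKEN": gitlab_token,  # GitLab token with the api scope
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are placeholders for illustration.
request = build_import_request(
    gitlab_url="https://gitlab.example.com",
    gitlab_token="<GITLAB-TOKEN>",
    github_token="<GITHUB-TOKEN>",
    repo_id=12345,
    target_namespace="customer-success",
    github_hostname="https://github.example.com",
)
print(request.get_method(), request.full_url)
```

To start the import, you would pass the request to `urllib.request.urlopen(request)` from a machine that can reach your GitLab instance.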
You can change the target namespace and target repository name before you import.
<i class="fa-youtube-play" aria-hidden="true"></i> For an overview of the import process, see how to migrate from GitHub to GitLab including actions.
Every import from GitHub is different, which affects the duration of imports you perform. However, in testing, GitLab
imported https://github.com/kubernetes/kubernetes in 76 hours. Tests showed that the project comprised:
To import projects from GitHub, you must enable the GitHub import source. If that import source is not enabled, ask your GitLab administrator to enable it. The GitHub import source is enabled by default on GitLab.com.
To use the GitHub importer, you must have:
Also, the organization that the GitHub repository belongs to must not impose a third-party application access policy that restricts the GitLab instance you import to.
GitHub pull request comments (known as diff notes in GitLab) created before 2017 are imported in separate threads.
This occurs because of a limitation of the GitHub API that doesn't include in_reply_to_id for comments before 2017.
In GitLab 18.3 and earlier, Markdown attachments from repositories on GitHub Enterprise Server instances are not imported. In GitLab 18.4 and later:
Because of a known issue, when importing projects that used
GitHub auto-merge, the imported project in GitLab can have merge commits labeled unverified if the commit was signed with the GitHub internal GPG key.
GitLab can't import GitHub Markdown image attachments that were uploaded to private repositories before 2023-05-09. If you encounter this problem and are willing to provide a sample repository, add a comment to issue 424046 and GitLab will contact you.
For GitLab-specific references, GitLab uses the # character for issues and a ! character for merge requests.
However, GitHub uses only the # character for both issues and pull requests. When importing:
When importing from GitHub accounts with SAML single sign-on (SSO) enabled, Markdown attachments might fail to import. This issue is caused by a GitHub API limitation where assets cannot be downloaded using a personal access token when SSO is enforced. To work around the issue, add the GitLab user performing the import as an outside collaborator to the GitHub repository. This permits access to private attachments during import.
You can import your GitHub repository by either:
If you are importing from github.com, you can use any method to import. Self-hosted GitHub Enterprise Server customers must use the API.
If you are importing to GitLab.com or to a GitLab Self-Managed instance that has GitHub OAuth configured, you can use GitHub OAuth to import your repository.
This method has an advantage over using a personal access token (PAT) because the backend exchanges the access token with the appropriate permissions.
To use a different method to perform an import after previously performing these steps, sign out of your GitLab account and sign in again.
To import your GitHub repository using a GitHub personal access token:
- repo scope.
- read:org scope.

To use a different token to perform an import after previously performing these steps, sign out of your GitLab account and sign in again, or revoke the older token in GitHub.
The import API can be used to import a GitHub repository. It has some advantages over using the GitLab UI:
- timeout_strategy option that is not available to the UI.

The REST API is limited to authenticating with GitLab personal access tokens.
To import your GitHub repository using the GitLab REST API:
- repo scope.
- read:org scope.
After you authorize access to your GitHub repositories, GitLab redirects you to the importer page and your GitHub repositories are listed.
Use one of the following tabs to filter the list of repositories:
When the Organization tab is selected, you can further narrow down your search by selecting an available GitHub organization from a dropdown list.
{{< history >}}

- github_import_extended_events. Disabled by default.
- github_import_extended_events removed.

{{< /history >}}
To make imports as fast as possible, the following items aren't imported from GitHub by default:
You can choose to import these items, but this could significantly increase import time. To import these items, select the appropriate fields in the UI:
By default, the proposed repository namespaces match the names as they exist in GitHub, but based on your permissions, you can choose to edit these names before you proceed to import any of them.
To select which repositories to import, next to any number of repositories, select Import, or select Import all repositories.
Additionally, you can filter projects by name. If a filter is applied, Import all repositories only imports matched repositories.
The Status column shows the import status of each repository. You can choose to keep the page open and watch updates in real time or you can return to it later.
To cancel imports that are pending or in progress, next to the imported project, select Cancel. If the import has already started, the imported files are kept.
To open a repository in GitLab after it has been imported, select its GitLab path.
Completed imports can be re-imported by selecting Re-import and specifying a new name. This creates a new copy of the source project.
After imports are completed, they can be in one of three states:
Expand Details to see a list of repository entities that failed to import.
GitLab adds backticks to username mentions in issues, merge requests, and notes. These backticks prevent linking to an incorrect user with the same username on the GitLab instance.
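To make the effect concrete, here is a minimal sketch of wrapping username mentions in backticks. It is an illustration only, not the importer's actual implementation: the regular expression and the `wrap_mentions` helper are assumptions, and the pattern only approximates GitHub username rules (alphanumeric characters and single hyphens).

```python
import re

# Hypothetical pattern: an @ that is not preceded by a word character,
# slash, or backtick (to skip email addresses and already-wrapped mentions),
# followed by a username that starts and ends with an alphanumeric character.
MENTION = re.compile(r"(?<![\w`/])@[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def wrap_mentions(text: str) -> str:
    """Wrap each @username mention in backticks so it renders as code."""
    return MENTION.sub(lambda match: f"`{match.group(0)}`", text)


print(wrap_mentions("Thanks @octo-cat, please email a@b.com"))
# prints: Thanks `@octo-cat`, please email a@b.com
```

Because the mention renders as inline code, it no longer resolves to a GitLab user who happens to share the same username.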
The GitHub importer uses a post-migration method of mapping user contributions for GitLab.com, GitLab Self-Managed, and GitLab Dedicated.
In GitLab 18.7 and earlier, you can disable the github_user_mapping feature flag to use the alternative user
contribution mapping method for imports.
[!flag] The availability of this feature is controlled by a feature flag. This feature is not recommended and is unavailable for:
- Migrations to GitLab.com.
- Migrations to GitLab Self-Managed and GitLab Dedicated 18.8 and later.
Problems that are found in this mapping method are unlikely to be fixed. Use the post-migration method instead, which doesn't have these limitations.
For more information, see issue 510963.
Requirements:
Using this method, when user accounts are provisioned correctly, users are mapped during the import.
If the requirements are not met, the importer can't map the particular user's contributions. In this case:
Depending on your GitLab tier, repository mirroring can be set up to keep your imported repository in sync with its GitHub copy.
Additionally, you can configure GitLab to send pipeline status updates back to GitHub with the GitHub Project Integration.
If you import your project using CI/CD for external repository, then both features are automatically configured.
[!note] Mirroring does not sync any new or updated pull requests from your GitHub project.
Administrator access on the GitLab server is required for these steps.
For large projects, it may take a while to import all data. To reduce the time necessary, you can increase the number of Sidekiq workers that process the following queues:

- github_importer
- github_importer_advance_stage

For an optimal experience, we recommend at least 4 Sidekiq processes (each running a number of threads equal to the number of CPU cores) that only process these queues. We also recommend running these processes on separate servers. For 4 servers with 8 cores each, this means you can import up to 32 objects (for example, issues) in parallel.
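The parallelism figure above is simple multiplication. As a small worked example, assuming one dedicated Sidekiq process per server and one thread per CPU core (the function name is hypothetical):

```python
def max_parallel_objects(sidekiq_processes: int, threads_per_process: int) -> int:
    """Upper bound on objects imported at once: one object per Sidekiq thread."""
    return sidekiq_processes * threads_per_process


# 4 servers, each running one Sidekiq process with 8 threads (one per core).
print(max_parallel_objects(sidekiq_processes=4, threads_per_process=8))
# prints: 32
```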
To reduce the time spent cloning a repository, increase network throughput, CPU capacity, and the performance of the disks that store the Git repositories for your GitLab instance (for example, by using high-performance SSDs). Increasing the number of Sidekiq workers does not reduce the time spent cloning repositories.
If you belong to a GitHub Enterprise Cloud organization, you can configure GitLab Self-Managed to obtain a higher GitHub API rate limit.
GitHub API requests are usually subject to a rate limit of 5,000 requests per hour. Using the steps below, you obtain a higher 15,000 requests per hour rate limit, resulting in a faster overall import time.
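To see what the higher limit can mean in practice, the sketch below estimates the minimum time the rate limit alone imposes on an import. The request count of 60,000 is a made-up figure for illustration, and real imports also spend time on cloning and processing, so treat the result as a lower bound.

```python
def hours_for_requests(total_requests: int, requests_per_hour: int) -> float:
    """Lower bound on hours needed when only the API rate limit is considered."""
    return total_requests / requests_per_hour


total = 60_000  # hypothetical number of GitHub API requests for one import
default_hours = hours_for_requests(total, 5_000)
higher_hours = hours_for_requests(total, 15_000)
print(default_hours, higher_hours)
# prints: 12.0 4.0
```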
Prerequisites:
To enable a higher rate limit:
The following items of a project are imported:
All fork branches of the project related to open pull requests
[!note] Fork branches are imported with a naming scheme similar to
GH-SHA-username/pull-request-number/fork-name/branch.
All project branches
Attachments for:
Branch protection rules
Git repository data
Issue and pull request comments
Issue and pull request events (can be imported as an additional item)
Issues
Labels
Milestones
Pull request assigned reviewers
Pull request merged by information
Pull request reviews
Pull request review comments
Pull request review replies to discussions
Pull request review suggestions
Pull requests
Release notes content
Repository descriptions
Wiki pages
References to pull requests and issues are preserved. Each imported repository maintains its visibility level unless that visibility level is restricted, in which case it defaults to the default project visibility.
Imported GitHub branch protection rules are mapped to one of the following:
| GitHub rule | GitLab rule |
|---|---|
| Require conversation resolution before merging for the project's default branch | All threads must be resolved project setting |
| Require a pull request before merging | No one option in the Allowed to push and merge branch protection settings |
| Require signed commits for the project's default branch | Reject unsigned commits GitLab push rule |
| Allow force pushes - Everyone | Allowed to force push branch protection setting |
| Require a pull request before merging - Require review from Code Owners | Require approval from code owners branch protection setting |
| Require a pull request before merging - Allow specified actors to bypass required pull requests | List of users in the Allowed to push and merge branch protection settings. Without a GitLab Premium subscription, the list of users that are allowed to push and merge is limited to roles. |
The Require status checks to pass before merging GitHub rule is not imported. You can still create external status checks manually. For more information, see issue 370948.
These GitHub collaborator roles are mapped to these GitLab member roles:
| GitHub role | Mapped GitLab role |
|---|---|
| Read | Guest |
| Triage | Reporter |
| Write | Developer |
| Maintain | Maintainer |
| Admin | Owner |
GitHub Enterprise Cloud has custom repository roles. These roles aren't supported and cause partially completed imports.
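The role mapping in the table can be expressed as a simple lookup. This is a sketch for illustration, not the importer's code: the dictionary and `map_role` helper are hypothetical names. Returning `None` for an unknown role mirrors the fact that custom repository roles have no GitLab equivalent.

```python
from typing import Optional

# Mapping taken from the table above (GitHub role -> GitLab role).
GITHUB_TO_GITLAB_ROLE = {
    "read": "Guest",
    "triage": "Reporter",
    "write": "Developer",
    "maintain": "Maintainer",
    "admin": "Owner",
}


def map_role(github_role: str) -> Optional[str]:
    """Return the mapped GitLab role, or None for unsupported (custom) roles."""
    return GITHUB_TO_GITLAB_ROLE.get(github_role.lower())


print(map_role("Triage"))            # prints: Reporter
print(map_role("security-auditor"))  # prints: None (custom role, not supported)
```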
To import GitHub collaborators, you must have the Write or Maintain role on the GitHub project. Otherwise, the collaborator import is skipped.
If your GitHub Enterprise instance is on an internal network that is inaccessible to the internet, you can use a reverse proxy to allow GitLab.com to access the instance.
The proxy needs to:
- Link header.

The GitHub API uses the Link header for pagination.
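The hostname conversion for the Link header can be sketched as a plain string substitution, followed by a check that every pagination URL now points at the proxy. The hostnames below are placeholders and both helper names are hypothetical. A real proxy performs the rewrite itself, so this only illustrates the expected result.

```python
import re
from urllib.parse import urlsplit


def rewrite_link_header(link: str, internal_origin: str, proxy_origin: str) -> str:
    """Replace every occurrence of the internal origin with the proxy origin."""
    return link.replace(internal_origin, proxy_origin)


def link_hosts(link: str) -> set:
    """Collect the hostnames of all URLs found in a Link header value."""
    return {urlsplit(url).hostname for url in re.findall(r"<([^>]+)>", link)}


# Placeholder hostnames for illustration.
original = (
    '<https://ghe.internal.example/api/v3/repositories/1/pulls?page=2>; rel="next", '
    '<https://ghe.internal.example/api/v3/repositories/1/pulls?page=11>; rel="last"'
)
rewritten = rewrite_link_header(
    original, "https://ghe.internal.example", "https://proxy.example.com"
)
print(link_hosts(rewritten))
# prints: {'proxy.example.com'}
```

If the internal hostname still appears in the set, pagination requests from GitLab would bypass the proxy and fail.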
After configuring the proxy, test it by making API requests. The following example commands test the API:
curl --header "Authorization: Bearer <YOUR-TOKEN>" "https://{PROXY_HOSTNAME}/user"
### URLs in the response body should use the proxy hostname
{
"login": "example_username",
"id": 1,
"url": "https://{PROXY_HOSTNAME}/users/example_username",
"html_url": "https://{PROXY_HOSTNAME}/example_username",
"followers_url": "https://{PROXY_HOSTNAME}/api/v3/users/example_username/followers",
...
"created_at": "2014-02-11T17:03:25Z",
"updated_at": "2022-10-18T14:36:27Z"
}
curl --head --header "Authorization: Bearer <YOUR-TOKEN>" "https://{PROXY_DOMAIN}/api/v3/repos/{repository_path}/pulls?state=all&sort=created&direction=asc"
### Link header should use the proxy hostname
HTTP/1.1 200 OK
Date: Tue, 18 Oct 2022 21:42:55 GMT
Server: GitHub.com
Content-Type: application/json; charset=utf-8
Cache-Control: private, max-age=60, s-maxage=60
...
X-OAuth-Scopes: repo
X-Accepted-OAuth-Scopes:
github-authentication-token-expiration: 2022-11-22 18:13:46 UTC
X-GitHub-Media-Type: github.v3; format=json
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4997
X-RateLimit-Reset: 1666132381
X-RateLimit-Used: 3
X-RateLimit-Resource: core
Link: <https://{PROXY_DOMAIN}/api/v3/repositories/1/pulls?page=2>; rel="next", <https://{PROXY_DOMAIN}/api/v3/repositories/1/pulls?page=11>; rel="last"
Also test that cloning the repository using the proxy does not fail:
git clone -c http.extraHeader="Authorization: basic <base64 encode YOUR-TOKEN>" --mirror https://{PROXY_DOMAIN}/{REPOSITORY_PATH}.git
The following configuration is an example of how to configure Apache HTTP Server as a reverse proxy:
[!warning] For simplicity, the snippet does not have configuration to encrypt the connection between the client and the proxy. However, for security reasons you should include that configuration. See sample Apache TLS/SSL configuration.
# Required modules
LoadModule filter_module lib/httpd/modules/mod_filter.so
LoadModule reflector_module lib/httpd/modules/mod_reflector.so
LoadModule substitute_module lib/httpd/modules/mod_substitute.so
LoadModule deflate_module lib/httpd/modules/mod_deflate.so
LoadModule headers_module lib/httpd/modules/mod_headers.so
LoadModule proxy_module lib/httpd/modules/mod_proxy.so
LoadModule proxy_connect_module lib/httpd/modules/mod_proxy_connect.so
LoadModule proxy_http_module lib/httpd/modules/mod_proxy_http.so
LoadModule ssl_module lib/httpd/modules/mod_ssl.so
<VirtualHost GITHUB_ENTERPRISE_HOSTNAME:80>
ServerName GITHUB_ENTERPRISE_HOSTNAME
# Enables reverse-proxy configuration with SSL support
SSLProxyEngine On
ProxyPass "/" "https://GITHUB_ENTERPRISE_HOSTNAME/"
ProxyPassReverse "/" "https://GITHUB_ENTERPRISE_HOSTNAME/"
# Replaces occurrences of the local GitHub Enterprise URL with the Proxy URL
# GitHub Enterprise compresses the responses, so the INFLATE and DEFLATE filters need to be used to
# decompress the response and compress it again
AddOutputFilterByType INFLATE;SUBSTITUTE;DEFLATE application/json
Substitute "s|https://GITHUB_ENTERPRISE_HOSTNAME|https://PROXY_HOSTNAME|ni"
SubstituteMaxLineLength 50M
# GitHub API uses the response header "Link" for the API pagination
# For example:
# <https://example.com/api/v3/repositories/1/issues?page=2>; rel="next", <https://example.com/api/v3/repositories/1/issues?page=3>; rel="last"
# The directive below replaces all occurrences of the GitHub Enterprise URL with the Proxy URL if the
# response header Link is present
Header edit* Link "https://GITHUB_ENTERPRISE_HOSTNAME" "https://PROXY_HOSTNAME"
</VirtualHost>