← Blog

Migration

CiviCRM to Salesforce: 23 field mappings that bite quietly

A Dutch healthcare association, 31,000 members, fifteen years of CiviCRM on Drupal 7, a Salesforce org that quietly chopped emails on import. The field map we wish we had.

Jacob Molkenboer· Founder · A Brand New Company· 7 Jun 2026· 9 min
Open leather logbook with brass key, green index card, twine shipping tag, wax-sealed envelope on ivory paper.

A success log that lied

Utrecht, 23:14 on a Wednesday. The migration consultant runs Salesforce Data Loader, watches the success counter tick up to 31,847. Zero errors. He closes the laptop. The next morning the membership team forwards seventeen "address unknown" replies from their newsletter platform, then forty more by lunch. The problem is not import failure. The problem is that info+facturatie@vereniging-langnaam-zorg.nl got silently chopped to info+facturatie@vereniging-langn the moment it hit the standard Contact.Email field, then propagated to every workflow downstream.

This is the migration we did last year for a healthcare association we will call the Vereniging. 31,000 members, fifteen years of data, CiviCRM 4.7 on Drupal 7. The brief was straightforward: land on Salesforce NPSP before Drupal 7 reached community end of life, keep the membership tiers, keep the historical contributions, do not lose a single relationship link. The schema mapping was not straightforward. What follows is the cheatsheet we wish we had had on day one.

Where the silent truncations live

CiviCRM's data model is a stack of well-normalized tables: civicrm_contact, civicrm_email, civicrm_phone, civicrm_address, civicrm_membership, joined on contact_id. Salesforce wants one fat row per Contact, with email and phone denormalized as primary fields and the rest spread across npe01__OppPayment__c, npe03__Recurring_Donation__c, and the custom objects shipped by NPSP.

The mismatch in shape is not the issue. Every ETL person can pivot rows. The issue is that on the way through, three categories of error happen without anyone telling you.

  1. String overruns. CiviCRM lets a field hold 254 chars. The Salesforce equivalent holds 80. Data Loader truncates and keeps going.
  2. Picklist drift. CiviCRM's OptionValue table has values like Lid - actief. The Salesforce picklist has Active member. With strict picklist enforcement off, the field is set to null without an error.
  3. Lookup misses. NPSP relationships need a target Contact ID, not a name. If the lookup fails, the relationship is dropped, but the parent record imports fine.

The cheatsheet below is the field map for the Vereniging migration, ordered roughly by how much data we lost when we did not catch the issue ahead of the bulk job. Severity is our own scale: S1 means data lost without trace, S2 means data lost but the row reported as failed, S3 means data preserved but transformed in ways that broke downstream workflows.

Tier S1, the silent truncators

  1. Email (primary). civicrm_email.email is varchar(254), per RFC 5321. The standard Salesforce Contact.Email is 80. We lost 412 addresses on first pass. Long Dutch domains with plus-aliases die first.
  2. Display name. civicrm_contact.display_name is 128. SF Name is computed from FirstName (40) + LastName (80). Anyone with a tussenvoegsel (van der, de) and a hyphenated surname overflows.
  3. Contact source. civicrm_contact.source is 255. SF LeadSource picklist values cap at 40 chars. Map free text to a picklist with strict mode off and every unmatched value becomes null.
  4. Notes. CiviCRM stores notes as rich HTML in civicrm_note.note. The NPSP ContentNote path strips tags and silently caps content at 128 KB. Long member-history notes lose paragraph breaks and then anything past the cap.
  5. Salutation. CiviCRM has free-text prefix (varchar 64). SF Salutation is a restricted picklist. Prof. dr. does not match Prof.
  6. Custom long text. NPSP custom long-text fields default to 32,768 chars. Anything longer is cut at the API layer with no warning in the bulk job summary.

Tier S2, the type clashes that fail the row

  1. Birth date. CiviCRM stores YYYY-MM-DD as DATE. Salesforce expects the same format, but any record with the legacy 0000-00-00 sentinel (MySQL strict mode off) is rejected.
  2. is_deceased. CiviCRM stores 1 and 0 as tinyint. SF expects TRUE or FALSE. Data Loader's CSV path accepts both. Bulk API v2 only accepts the latter.
  3. Gender. civicrm_contact.gender_id is a foreign key to civicrm_option_value. NPSP npsp__Gender__c is a picklist. Three Vereniging members had a custom Anders value not in NPSP's default set.
  4. Phone. CiviCRM allows arbitrary formatting. SF Phone is 40 chars and accepts anything on insert, but the validation rules NPSP ships with reject non-E.164 input on later update.
  5. Address country. CiviCRM's civicrm_country.iso_code is ISO 3166 alpha-2. If SF has State and Country picklists enabled (default for new orgs), NL matches, but historical Dutch overseas addresses tagged AN (Netherlands Antilles, retired in 2010) fail.
  6. Currency. Multi-currency orgs need CurrencyIsoCode on every row. Vereniging had thirty-two CHF contributions from a Swiss sister org. Without the column, they imported as EUR.
  7. Decimal precision. SF Number(16,2) accepts up to 18 digits but truncates to 2 decimals silently. Civi's civicrm_contribution.total_amount is decimal(20,2). Three-decimal values round without a row error.

Tier S3, the transforms that break downstream

  1. Owner. Civi has no record ownership. SF defaults to the running user. If your integration user has the wrong profile, every contact lands under that user and shows up in nobody's list views.
  2. RecordType. NPSP uses Household_Account and Organization. Without explicit RecordTypeId, every contact lands under the default. Bulk re-assignment after the fact triggers the duplicate rules.
  3. Tags. civicrm_tag is many-to-many via civicrm_entity_tag. SF has Topics, or you build a junction object. The multi-select picklist semicolon delimiter is a trap: comma-separated input gets stored as one value.
  4. Groups. CiviCRM smart groups (dynamic) have no Salesforce equivalent. Static groups become Campaigns or Public Groups depending on whether they have membership semantics or sharing semantics.
  5. Relationships. NPSP npsp__Relationship__c needs both Contacts to already exist. If you import in one pass, half the relationships drop. Two-pass import: Contacts first, relationships second, anchored on external ID.
  6. Memberships. civicrm_membership has status, type, dates. NPSP does not ship membership natively; either install the Membership Management package or build custom objects. We built custom because the package's status engine did not match the Vereniging's tier rules.
  7. Contributions. civicrm_contribution maps to Opportunity with StageName="Closed Won" for received payments. The default NPSP stage values are English. The Vereniging team works in Dutch. Either translate the picklist or accept English stages internally.
  8. Activities. civicrm_activity with activity_type_id maps to Task or Event. Civi can assign one activity to multiple contacts. SF needs one Task per assignee, so the row count grows on the way in.
  9. Email opt-out. civicrm_contact.is_opt_out (tinyint) maps to HasOptedOutOfEmail (boolean). The catch: the Civi field is the inverse of do_not_email, and the Vereniging team had inconsistently used both for a decade.
  10. IBAN. SF has no native IBAN type. We stored as Text(34) with a validation rule. Leading-zero IBANs (e.g. NL02ABNA...) get coerced to numbers by Excel during the CSV stop. Always export as CSV with quoted strings, never through Excel.
  11. External ID. Add a custom field CiviCRM_ID__c as External ID, Unique, Indexed on every imported object. This is the one mapping you cannot skip. Without it, the second-pass import for relationships, activities, and contributions has no anchor.

The SQL we ran to flatten Civi for export

SELECT
  c.id                            AS civicrm_id,
  c.contact_type,
  c.first_name,
  c.last_name,
  c.middle_name,
  c.prefix_id,
  c.gender_id,
  c.birth_date,
  c.is_deceased,
  c.do_not_email,
  c.source                        AS contact_source,
  e.email                         AS primary_email,
  LENGTH(e.email)                 AS email_len,
  p.phone                         AS primary_phone,
  a.street_address,
  a.city,
  a.postal_code,
  co.iso_code                     AS country_iso
FROM civicrm_contact c
LEFT JOIN civicrm_email   e  ON e.contact_id = c.id AND e.is_primary = 1
LEFT JOIN civicrm_phone   p  ON p.contact_id = c.id AND p.is_primary = 1
LEFT JOIN civicrm_address a  ON a.contact_id = c.id AND a.is_primary = 1
LEFT JOIN civicrm_country co ON co.id = a.country_id
WHERE c.is_deleted = 0
  AND c.contact_type = 'Individual';

The email_len column earned its keep. We sorted by it descending before the first import pass and found 412 rows above 80 characters. The Vereniging's policy: keep the long address in CiviCRM, create a forwarder, write the forwarder into the Salesforce field. No address was lost in the end. Without the length column, all 412 would have shipped truncated and we would have learned about it from bounce reports a week later.

Warning

The Salesforce Bulk API reports the row as successful even when a string field is truncated to fit. The only ways to detect it are a pre-flight length check against FieldDefinition, or a post-flight diff between source and target. Trust neither the success log nor the row count.

The pre-flight script that saved the second pass

Before each batch we ran a length audit against the SF schema, pulled once from the tooling/sobjects/Contact/describe endpoint and cached. If any source value exceeded its target length, the row was flagged to a review CSV instead of being shipped.

import csv

SF_LIMITS = {  # pulled once from describe(), cached locally
    "Email":             80,
    "FirstName":         40,
    "LastName":          80,
    "Phone":             40,
    "MailingStreet":    255,
    "MailingPostalCode": 20,
    "Salutation":        40,
    "LeadSource":        40,
}

with open("civi_export.csv")  as src, \
     open("ready.csv",  "w")  as ok, \
     open("review.csv", "w")  as bad:

    r     = csv.DictReader(src)
    w_ok  = csv.DictWriter(ok,  fieldnames=r.fieldnames)
    w_bad = csv.DictWriter(bad, fieldnames=r.fieldnames + ["overflow"])
    w_ok.writeheader(); w_bad.writeheader()

    for row in r:
        overflow = [
            f"{k}:{len(row[k])}>{SF_LIMITS[k]}"
            for k in SF_LIMITS
            if k in row and len(row[k]) > SF_LIMITS[k]
        ]
        if overflow:
            row["overflow"] = ", ".join(overflow)
            w_bad.writerow(row)
        else:
            w_ok.writerow(row)

Five minutes to write, three days saved on the back end. The review CSV went to the membership team, not to the developers; data owners decided what to shorten, what to forward, what to drop. That conversation is the migration. The bulk job is the boring afterthought.

The smallest thing to do today

If you are sitting on a CiviCRM stack and a Salesforce migration is on the roadmap, run the SQL above against your own database tonight. Sort by email_len DESC. Anything over 80 is your first conversation with the data owners, before you even think about ETL tooling. The 23-row cheatsheet is the next conversation, not the first.

When we built the Vereniging migration, the thing we kept running into was Tier S1: data lost without a log line. The fix that paid for itself was the pre-flight describe-vs-source diff above, and everything downstream of that was careful execution. If you want a hand on a CiviCRM-to-Salesforce or any other legacy migration, that diff is the file we send first.

Key takeaway

On a CiviCRM to Salesforce migration the data is not lost in the failed rows; it is lost in the successful ones. Run a length audit before every batch.

FAQ

How long does a CiviCRM to Salesforce migration take?

For 31,000 members with custom memberships and contributions, plan eight to twelve weeks end to end. Six of those weeks are field mapping and reconciliation, not actual data movement.

Does Salesforce Data Loader detect string truncation?

No. It reports the row as successful even when a long string is silently cut to fit the target field length. A pre-flight describe-vs-source diff is the only reliable check.

Do we need NPSP for a membership-based migration?

If the source has relationships, households, and recurring contributions, NPSP saves months of custom-object work. If it is a flat contact list, standard Salesforce is enough.

What is the most overlooked field on the import?

External ID. Add a unique, indexed external ID column to every object you import. Without it the second-pass import for relationships, memberships, and activities has no anchor.

migrationdrupallegacy sitesmysqlcase studyarchitecture

Building something?

Start a project