Data Import/Export Using Firebase and MeiliSearch in React via Cloud Functions

In most web apps, import/export functionality is a nice-to-have feature. Firebase, a Backend-as-a-Service solution, makes it easy to add an import/export feature to a web app or SaaS via Firebase Cloud Functions. If your SaaS also needs search functionality, which Firebase lacks, you can turn to MeiliSearch, an open-source search engine with a user-friendly API.

Following this approach, we develop two HTTPS callable Cloud Functions: an import function and an export function.
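Both functions obtain their MeiliSearch client from a MeilisearchService singleton, which the snippets below reference but do not define. A minimal sketch of such a wrapper, assuming the meilisearch npm package and that the host and API key are stored under hypothetical Functions config keys, might look like this:

import { MeiliSearch } from 'meilisearch'
import * as functions from 'firebase-functions'

// Singleton wrapper so every invocation reuses a single MeiliSearch client
class MeilisearchService {
  private static instance: MeiliSearch

  static getInstance(): MeiliSearch {
    if (!MeilisearchService.instance) {
      MeilisearchService.instance = new MeiliSearch({
        host: functions.config().meilisearch.host,
        apiKey: functions.config().meilisearch.apikey,
      })
    }
    return MeilisearchService.instance
  }
}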

The import function receives the data parsed on the frontend as an array of JSON objects, verifies its validity, and populates both the specified collection in Firebase and the corresponding index in MeiliSearch.

Conversely, the export function treats MeiliSearch as the source of truth synchronized with Firebase and fetches all documents of the given index from MeiliSearch. This also avoids the cost of Firebase document reads, which can be substantial depending on the collection size. Here's an example of what the code might look like:

const limit = 2000
let offset = 0
let hasMore = true
const docs: any[] = []
do {
  // fetch the next page of documents from the MeiliSearch index
  const currentDocuments = await client.index(indexId).getDocuments({ offset, limit })
  if (currentDocuments.length === 0) {
    hasMore = false
  }
  currentDocuments.forEach((doc) => docs.push(doc))
  offset += limit
} while (hasMore)

The limit we impose here is not strict and can be changed per request, but it is also good practice to cap result sizes on the MeiliSearch side to protect against malicious queries.
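As one concrete knob, Meilisearch v0.28 and later expose a pagination.maxTotalHits index setting. The following is a minimal sketch of tightening it via updateSettings; note that it caps search pagination rather than the documents endpoint used above, and the value 2000 simply mirrors our page size:

// cap how many results a search on this index can page through
await client.index(indexId).updateSettings({
  pagination: { maxTotalHits: 2000 },
})

The implementation of the import function, which is specific to the application's domain, is demonstrated below. Here, we assume that our SaaS user belongs to an enterprise within our application and wishes to import data about their registered customers.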

const importData = functions.region('europe-west3').https.onCall(async (data, _) => {
  const client = MeilisearchService.getInstance()
  const writeBatch = admin.firestore().batch()

  const response = {
    succeed: true,
    code: 'success',
    message: '',
  }

  try {
    const customers = data.rows as any[]
    const enterprise = data.enterpriseId as string
    const customersJson: any[] = []

    customers.forEach((customer) => {
      // build the document once and reuse it for both MeiliSearch and Firestore
      const customerDoc = {
        ...customer,
        timestamp: new Date().getTime(),
        disabled: false,
        customer_id: '_' + customer['phoneNumber'].replace('+', '_'),
      }
      customersJson.push(customerDoc)
      const customerRef = admin
        .firestore()
        .collection('enterprises')
        .doc(enterprise)
        .collection('customers')
        .doc(customer['phoneNumber'])
      writeBatch.set(customerRef, customerDoc, { merge: true })
    })
    await writeBatch.commit()
    // addDocumentsInBatches enqueues the documents in batches of 20; the promise
    // resolves once every batch has been enqueued
    await client.index(enterprise).addDocumentsInBatches(customersJson, 20)
    return response
  } catch (error) {
    functions.logger.error(error)
    response.succeed = false
    response.code = 'error-server'
    response.message = `Unexpected error.`
    return response
  }
})

export default importData

The export function, which queries the desired collection, is implemented as follows:

const exportData = functions.region('europe-west3').https.onCall(async (data, _) => {
  const client = MeilisearchService.getInstance()

  const response = {
    url: '',
    succeed: true,
    code: 'success',
    message: '',
  }

  const docs: any[] = []
  try {
    const enterprise = data.enterpriseId as string
    // fetch the enterprise document, which stores the exportable field names
    const enterpriseObjectRef = await admin
      .firestore()
      .collection('enterprises')
      .doc(enterprise)
      .get()

    const enterpriseObject = enterpriseObjectRef.data()

    if (!enterpriseObject) {
      throw new Error('enterprise not found')
    }

    const bucket = admin.storage().bucket()

    const file = `exports/${enterprise}_${getDate()}_export.csv`
    const limit = 2000
    let offset = 0
    let hasMore = true
    do {
      const documents = await client.index(enterprise).getDocuments({ offset, limit })
      if (documents.length === 0) {
        hasMore = false
      }
      documents.forEach((doc) => docs.push(doc))
      offset += limit
    } while (hasMore)

    // we have all the docs here, populate the csv
    let csv = ''
    enterpriseObject.fields.forEach((field: string) => {
      csv += `${field},`
    })
    csv += '\n'

    docs.forEach((doc) => {
      enterpriseObject.fields.forEach((field: string) => {
        if (field in doc) {
          csv += `${doc[field]},`
        } else {
          csv += ','
        }
      })
      csv += '\n'
    })
    await bucket.file(file).save(csv)

    // create a presigned url for the uploaded file
    const [url] = await bucket.file(file).getSignedUrl({
      action: 'read',
      expires: Date.now() + 1000 * 60 * 60, // one hour
      version: 'v2', // defaults to 'v2' if missing
    })

    response.url = url
    return response
  } catch (error) {
    functions.logger.error(error)
    response.succeed = false
    response.code = 'error-server'
    response.message = `Unexpected error.`
    return response
  }
})
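The getDate() helper used to build the export file name is not shown above; a simple hypothetical version that stamps the file with today's date could be:

// hypothetical helper: formats the current date as YYYY-MM-DD for the file name
const getDate = (): string => new Date().toISOString().split('T')[0]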

Take note: to generate a pre-signed URL for an object uploaded to Cloud Storage, the service account associated with the Cloud Function must have the iam.serviceAccounts.signBlob permission (included, for example, in the Service Account Token Creator role). This can be assigned via the Google Cloud Console.

On the frontend, we can have a straightforward component that manages two handlers: onImportDataClick and onExportDataClick. The structure of this component looks as follows:

const ImportExport = (props: any) => {
  // State to store parsed data
  const [parsedData, setParsedData] = useState([])

  //State to store table Column name
  const [tableRows, setTableRows] = useState<string[]>([])

  //State to store the values
  const [values, setValues] = useState<any[]>([])
  const [customers, setCustomers] = useState<IDictionary<any>[]>([])
  const { error, handleError, resetErrorAnd, getError } = useTranslatedError()
  const [isLoading, setIsLoading] = useState(false)
  const { user } = useContext(UserContext)

  const changeHandler = async (event: any) => {
    // Passing file data (event.target.files[0]) to parse using Papa.parse

    const file = event.target.files[0]

    Papa.parse(file, {
      header: true,
      skipEmptyLines: true,
      complete: function (results: any) {
        let rowsArray: string[][] = []
        let valuesArray: any[] = []

        // Iterating over the data to collect the column names and their values
        results.data.forEach((d: any) => {
          rowsArray.push(Object.keys(d))
          valuesArray.push(Object.values(d))
        })

        // checking that the column names match the data model of our collection
        if (!rowsArray[0].includes('phoneNumber')) {
          handleError(new CustomError('Please provide a file with a phoneNumber as column', ''))
          return
        }
        if (!rowsArray[0].includes('firstName')) {
          handleError(new CustomError('Please provide a file with a firstName as column', ''))
          return
        }
        if (!rowsArray[0].includes('lastName')) {
          handleError(new CustomError('Please provide a file with a lastName as column', ''))
          return
        }
        // Parsed Data Response in array format
        setParsedData(results.data)

        // Filtered Column Names: if the first column is entirely empty
        // (e.g. an index column exported by a spreadsheet), drop it from the values
        if (valuesArray.map((val) => val[0]).every((row) => row === '')) {
          valuesArray = valuesArray.map((val) => val.slice(1))
        }
        let rows = rowsArray[0]
        if (rows[0] == '') {
          rows = rows.slice(1)
        }
        // Filtered Values
        setTableRows(rows)

        setValues(valuesArray)

        let customers: any[] = []
        // map the csv rows to a customers array, which is the payload for the import
        valuesArray.forEach((value) => {
          const customer: any = {}
          value.forEach((v: any, i: number) => {
            customer[rows[i]] = v
          })
          customers.push(customer)
        })
        setCustomers(customers)
      },
    })
  }

  const onExportDataClick = () => {
    // to be defined
  }
  const onImportDataClick = async () => {
    // to be defined
  }

  return (
    <div>
      <input
        type="file"
        name="file"
        onChange={changeHandler}
        accept=".csv"
        style={{ display: 'block', margin: '10px auto' }}
      />
      <div className="mb-4 w-1/3">
        <Button onClick={onExportDataClick}>{t('EXPORT')}</Button>
      </div>
      {values.length > 9 && (
        <div className="mb-4 w-1/3">
          <Button onClick={onImportDataClick}>{t('IMPORT')}</Button>
        </div>
      )}
      {isLoading ? (
        <Spinner />
      ) : (
        <>
          <table>
            <thead>
              <tr>
                {tableRows.map((rows: any, index: any) => {
                  return <th key={index}>{rows}</th>
                })}
              </tr>
            </thead>
            <tbody>
              {values
                .slice(0, Math.min(100, Math.floor(values.length / 2)))
                .map((value: any, index: any) => {
                  return (
                    <tr key={index}>
                      {value.map((val: any, i: any) => {
                        return <td key={i}>{val}</td>
                      })}
                    </tr>
                  )
                })}
            </tbody>
          </table>
          <div>
            {values.length > 0 && (
              <p>
                Showing {Math.min(100, Math.floor(values.length / 2))} / {values.length}
              </p>
            )}
          </div>
        </>
      )}
    </div>
  )
}

To parse the CSV file, we use a library called PapaParse. With PapaParse, we can extract the columns and rows from the CSV in the complete callback and validate that the data matches our target data model.

The implementation for invoking the exportData function within the onExportDataClick handler is as follows:

const onExportDataClick = () => {
  setIsLoading(true)
  Firebase.getFunctions()
    .httpsCallable('exportData')({
      enterpriseId: user.enterprise,
    })
    .then(({ data }) => {
      if (!data.succeed) {
        throw new Error('something bad happened')
      } else {
        // open the presigned url in a new tab
        window.open(data.url, '_blank')
      }
    })
    .catch(handleError)
    .finally(() => setIsLoading(false))
}
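The Firebase.getFunctions() helper used above is likewise not part of these snippets; a minimal sketch of such a wrapper, assuming the Firebase JS SDK v8 API and the same region as our functions, could be:

import firebase from 'firebase/app'
import 'firebase/functions'

// hypothetical thin wrapper matching the httpsCallable usage in the handlers
const Firebase = {
  getFunctions: () => firebase.app().functions('europe-west3'),
}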

The implementation of onImportDataClick is slightly more involved, as Firebase limits the number of writes in a single writeBatch commit to 500. We address this limitation on the frontend by splitting the array mapped from the CSV rows into chunks of 500 each, using the lodash library. The code for this handler looks like this:

const onImportDataClick = async () => {
  if (customers.length > 20000) {
    throw new Error('the maximum limit is 20k rows')
  }
  setIsLoading(true)
  const customerChunks = _.chunk(customers, 500)
  try {
    for (let index = 0; index < customerChunks.length; index++) {
      const chunk = customerChunks[index]
      const { data } = await Firebase.getFunctions().httpsCallable('importData')({
        rows: chunk,
        enterpriseId: user.enterprise,
      })
      if (!data.succeed) {
        throw new Error('something bad happened')
      }
    }
    // all data imported successfully, redirect to the table page
    props.history.push(ROUTES[MenuOptions.CUSTOMERS])
  } catch (e) {
    handleError(e)
  } finally {
    setIsLoading(false)
  }
}

I hope you found this post helpful. Peace out ✌️ and keep coding 👨‍💻