Visible to Intel only — GUID: GUID-437E992F-E309-4E29-8F27-EEBE0A05565A
Visible to Intel only — GUID: GUID-437E992F-E309-4E29-8F27-EEBE0A05565A
DBSCAN
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a density-based clustering non-parametric algorithm: given a set of observations in some space, it groups together observations that are closely packed together (observations with many nearby neighbors), marking as outliers observations that lie alone in low-density regions (whose nearest neighbors are too far away).
Operation |
Computational methods |
Programming Interface |
||
Default method |
Mathematical formulation
Computation
Given the set of np-dimensional feature vectors (further referred as observations), a positive floating-point number epsilon and a positive integer minObservations, the problem is to get clustering assignments for each input observation, based on the definitions below [Ester96]: two observations x and y are considered to be in the same cluster if there is a core observationz, and x and y are both reachable from z.
Each cluster gets a unique identifier, an integer number from 0 to . Each observation is assigned an identifier of the cluster it belongs to, or -1 if the observation considered to be a noise observation.
Programming Interface
Refer to API Reference: DBSCAN.
Distributed mode
The algorithm supports distributed execution in SPMD mode (only on GPU).
Usage Example
Compute
void run_compute(const table& data, const table& weights) { double epsilon = 1.0; std::int64_t max_observations = 5; const auto dbscan_desc = kmeans::descriptor<float>{epsilon, max_observations} .set_result_options(dal::dbscan::result_options::responses); const auto result = compute(dbscan_desc, data, weights); print_table("responses", result.get_responses()); }
Examples
oneAPI DPC++
Batch Processing:
oneAPI C++
Batch Processing: